National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD 20782, USA.
Stat Med. 2011 May 20;30(11):1302-11. doi: 10.1002/sim.4219. Epub 2011 Mar 22.
Life expectancy is an important measure for health research and policymaking. Linking individual survey records to mortality data can overcome limitations in vital statistics data used to examine differential mortality by permitting the construction of death rates based on information collected from respondents at the time of interview and facilitating estimation of life expectancies for subgroups of interest. However, use of complex survey data linked to mortality data can complicate the estimation of standard errors. This paper presents a case study of approaches to variance estimation for life expectancies based on life tables, using the National Health Interview Survey Linked Mortality Files. The approaches considered include application of Chiang's traditional method, which is straightforward but does not account for the complex design features of the data; balanced repeated replication (BRR), which is more complicated but accounts more fully for the design features; and compromise, 'hybrid' approaches, which can be less difficult to implement than BRR but still account partially for the design features. Two tentative conclusions are drawn. First, it is important to account for the effects of the complex sample design, at least within life-table age intervals. Second, accounting for the effects within age intervals but not across age intervals, as is done by the hybrid methods, can yield reasonably accurate estimates of standard errors, especially for subgroups of interest with more homogeneous characteristics among their members.
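To illustrate the replication idea described in the abstract, the following is a minimal sketch of balanced repeated replication (BRR) applied to a life-table life expectancy. It is not the authors' implementation. The function names (`life_expectancy`, `brr_variance`), the assumption that deaths occur on average at the midpoint of each closed age interval (`ax=0.5`), and all numeric rates are hypothetical. In practice, each replicate set of death rates would be recomputed from the survey's half-sample replicate weights.

```python
import numpy as np

def life_expectancy(mx, widths, ax=0.5):
    """Life expectancy at the start of the first age interval.

    mx     : age-specific death rates (per person-year); the last
             entry is for the open-ended final interval.
    widths : widths (years) of the closed intervals, len(mx) - 1.
    ax     : assumed average fraction of a closed interval lived by
             those who die in it (hypothetical default of 0.5).
    """
    mx = np.asarray(mx, dtype=float)
    n = np.asarray(widths, dtype=float)
    # Convert death rates to probabilities of dying in closed intervals.
    qx = n * mx[:-1] / (1.0 + (1.0 - ax) * n * mx[:-1])
    # Survivors at the start of each interval (radix of 1).
    lx = np.concatenate(([1.0], np.cumprod(1.0 - qx)))
    dx = lx[:-1] * qx
    # Person-years lived in closed intervals, then in the open interval.
    Lx_closed = n * (lx[:-1] - dx) + ax * n * dx
    L_open = lx[-1] / mx[-1]
    return (Lx_closed.sum() + L_open) / lx[0]

def brr_variance(full_estimate, replicate_estimates):
    """Standard BRR variance: mean squared deviation of the replicate
    estimates from the full-sample estimate (no Fay adjustment)."""
    reps = np.asarray(replicate_estimates, dtype=float)
    return float(np.mean((reps - full_estimate) ** 2))

# Hypothetical full-sample and replicate death rates for three age
# intervals (two closed 10-year intervals plus an open final interval).
widths = [10, 10]
full_mx = [0.010, 0.030, 0.100]
replicate_mx = [
    [0.011, 0.029, 0.102],
    [0.009, 0.031, 0.098],
    [0.010, 0.032, 0.101],
    [0.010, 0.028, 0.099],
]

e_full = life_expectancy(full_mx, widths)
e_reps = [life_expectancy(m, widths) for m in replicate_mx]
se_brr = brr_variance(e_full, e_reps) ** 0.5
print(f"life expectancy = {e_full:.2f}, BRR standard error = {se_brr:.3f}")
```

Because the whole life table is recomputed for every replicate, this approach captures design effects both within and across age intervals. That is the feature the hybrid methods give up in exchange for simpler implementation.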