Rogot E, Sorlie P, Johnson N J
J Chronic Dis. 1986;39(9):719-34. doi: 10.1016/0021-9681(86)90155-4.
The National Death Index (NDI) of the National Center for Health Statistics is a powerful tool for identifying deaths in epidemiologic studies. For every input record, the NDI generates a list of possible matches according to its matching criteria. Determining the true match from this list becomes formidable when a large number of records are being investigated. In the National Longitudinal Mortality Study, nearly one million Census records are being matched to the NDI, which requires an efficient and accurate method to screen out false positive matches. In a pilot study for the larger mortality follow-up, Census Bureau files containing 226,000 person records were matched to the 1979 NDI. The results of this match were used to develop a probabilistic method that separates the possible matches into three categories: true positives, false positives, and matches of questionable status that require manual review of the Census record and the death certificate. Of the 5542 possible matches, about one-third were ultimately determined to be true positives and two-thirds false positives. The probabilistic method was validated by replications on subsets of the data and promises to save considerable time in the review of records in the large national study of mortality.
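The abstract does not spell out how the probabilistic method scores a possible match, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a Fellegi-Sunter-style agreement weight, which is a swapped-in technique, and sorts each candidate pair into the three categories named above using two cutoffs. The field names, the m/u probabilities, and the thresholds are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not from the study):
#   m = P(field agrees | the two records refer to the same person)
#   u = P(field agrees | the two records refer to different people)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_year": (0.90, 0.02),
    "birth_month": (0.95, 0.08),
    "state_of_birth": (0.85, 0.10),
}


def match_score(agreements):
    """Sum log2 agreement/disagreement weights over the compared fields.

    `agreements` maps a field name to True (the field agrees) or False.
    """
    score = 0.0
    for field, agrees in agreements.items():
        m, u = FIELD_PROBS[field]
        if agrees:
            score += math.log2(m / u)              # agreement raises the score
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return score


def classify(score, lower=0.0, upper=10.0):
    """Assign one of three categories using two assumed cutoffs."""
    if score >= upper:
        return "true positive"
    if score <= lower:
        return "false positive"
    return "manual review"


# Example: a candidate pair that agrees on last name and birth year only.
candidate = {
    "last_name": True,
    "first_name": False,
    "birth_year": True,
    "birth_month": False,
    "state_of_birth": False,
}
print(classify(match_score(candidate)))
```

In a setup like this, the two cutoffs control how many possible matches fall into the manual review band, which is the workload the abstract says the method is meant to reduce. In practice they would be tuned against pairs whose true status is already known.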