Clegg Limin X, Reichman Marsha E, Hankey Benjamin F, Miller Barry A, Lin Yi D, Johnson Norman J, Schwartz Stephen M, Bernstein Leslie, Chen Vivien W, Goodman Marc T, Gomez Scarlett L, Graff John J, Lynch Charles F, Lin Charles C, Edwards Brenda K
Office of Healthcare Inspections, Office of Inspector General, US Department of Veterans Affairs, Washington, DC 20001, USA.
Cancer Causes Control. 2007 Mar;18(2):177-87. doi: 10.1007/s10552-006-0089-4. Epub 2007 Jan 11.
Population-based cancer registry data from the Surveillance, Epidemiology, and End Results (SEER) Program at the National Cancer Institute are based on medical records and administrative information. Although SEER data have been used extensively in health disparities research, the quality of information concerning race, Hispanic ethnicity, and immigrant status has not been systematically evaluated. The quality of this information was determined by comparing SEER data with self-reported data among 13,538 cancer patients diagnosed between 1973 and 2001 in the SEER-National Longitudinal Mortality Study linked database. The overall agreement was excellent on race (kappa = 0.90, 95% CI = 0.88-0.91), moderate to substantial on Hispanic ethnicity (kappa = 0.61, 95% CI = 0.58-0.64), and low on immigrant status (kappa = 0.21, 95% CI = 0.10-0.23). As a result of these disagreements, SEER data tended to under-classify patient numbers relative to self-identification, except for the non-Hispanic group, which was slightly over-classified. These disagreements translated into differing race-, ethnicity-, and immigrant status-specific cancer statistics, depending on whether self-reported or SEER data were used. In particular, the 5-year Kaplan-Meier survival and the median all-cause survival time for American Indians/Alaska Natives were substantially higher when based on self-classification (59% and 140 months, respectively) than when based on SEER classification (44% and 53 months, respectively), although the number of patients was small. These results can serve as a useful guide to researchers contemplating the use of population-based registry data to ascertain disparities in cancer burden. In particular, the study results caution against evaluating health disparities by using birthplace as a measure of immigrant status, or by relying on registry race information for American Indians/Alaska Natives.
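The agreement figures in the abstract are kappa statistics, which compare observed agreement between two classifications with the agreement expected by chance. The following is a minimal sketch of Cohen's kappa for a square cross-classification table; the function name `cohen_kappa` and the counts in `table` are hypothetical illustrations, not data or code from the study, and the sketch does not compute the confidence intervals the abstract reports.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows index one classification (e.g. registry record), columns the
    other (e.g. self-report); table[i][j] is a count of patients.
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n

    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)

    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 counts, for illustration only (not from the study).
table = [
    [40, 10],
    [5, 45],
]
kappa = cohen_kappa(table)
print(round(kappa, 2))
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why the abstract describes 0.90 as excellent and 0.21 as low.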