Deshpande Anjali D, Schootman Mario, Mayer Allese
Division of General Medical Sciences, Department of Medicine, School of Medicine, Washington University in St. Louis, St. Louis, MO.
Department of Epidemiology, College for Public Health and Social Justice, Saint Louis University, St. Louis, MO.
Ann Epidemiol. 2015 Apr;25(4):297-300. doi: 10.1016/j.annepidem.2015.01.005. Epub 2015 Jan 16.
To examine the validity of claims data to identify colorectal cancer (CRC) recurrence and determine the extent to which misclassification of recurrence status affects estimates of its association with overall survival in a population-based administrative database.
We calculated the accuracy of claims data relative to medical records from one large tertiary hospital to identify CRC recurrence. We estimated the effect of misclassifying recurrence on survival by applying these findings to the linked Surveillance, Epidemiology, and End Results-Medicare data.
Of 174 eligible CRC patients identified through medical records, 32 (18.4%) had a recurrence. A claims-based algorithm of secondary malignancy codes yielded a sensitivity of 81% and specificity of 99% for identifying recurrence. Agreement between data sources was almost perfect (kappa: 0.86). In a model unadjusted for misclassification, CRC patients with recurrence were 3.04 times (95% confidence interval: 2.92-3.17) more likely to die of any cause than those without recurrence. In the corrected model, CRC patients with recurrence were 3.47 times (95% confidence interval: 3.06-4.14) more likely to die than those without recurrence.
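The accuracy figures above can be illustrated with a short calculation. The sketch below computes sensitivity, specificity, and Cohen's kappa from a 2x2 table of claims-based classification against medical records. The abstract reports only the totals (174 patients, 32 recurrences) and the resulting percentages, so the individual cell counts used here (26, 6, 1, 141) are an assumption, chosen because they are consistent with the reported values. They are not taken from the paper, and the function name is hypothetical.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, Cohen's kappa) for a 2x2 table.

    tp/fn: record-confirmed recurrences flagged / missed by claims.
    fp/tn: record-confirmed non-recurrences flagged / not flagged by claims.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Observed agreement: proportion of patients on the table's diagonal.
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# ASSUMED cell counts: only the totals 174 and 32 come from the abstract.
sens, spec, kappa = accuracy_metrics(tp=26, fn=6, fp=1, tn=141)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} kappa={kappa:.2f}")
```

Under these assumed counts, the rounded results match the reported 81%, 99%, and 0.86. Other cell counts may round to the same figures, so this is a plausibility check rather than a reconstruction of the study data.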
Identifying recurrence in CRC patients using claims data is feasible with moderate sensitivity and high specificity. Future studies can use this algorithm with Surveillance, Epidemiology, and End Results-Medicare data to study treatment patterns and outcomes of CRC patients with recurrence.