Comparative Health Outcomes, Policy, and Economics (CHOICE) Institute, University of Washington, Seattle, WA.
Department of Biostatistics, University of Washington, Seattle, WA.
JCO Clin Cancer Inform. 2023 Jun;7:e2300004. doi: 10.1200/CCI.23.00004.
There is growing interest in using computable phenotypes or proxies to identify important clinical outcomes, such as cancer recurrence, in rich electronic health records data. However, the race/ethnicity-specific accuracies of these proxies remain unclear. We examined whether the accuracy of a proxy for colorectal cancer (CRC) recurrence differed by race/ethnicity and the possible mechanisms that drove the differences.
Using data from a large integrated health care system, we identified a stratified random sample of 282 Black/African American (AA), Hispanic, and non-Hispanic White (NHW) patients with CRC who received primary treatment. Each patient's 5-year recurrence status was estimated using a utilization-based proxy and evaluated, overall and by race/ethnicity, against the true recurrence status obtained from detailed chart review. We used covariate-adjusted probit regression models to estimate the associations between race/ethnicity and misclassification.
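The group-wise comparison of proxy status against chart-review status can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `predictive_values_by_group`, the record layout, and the group labels in the usage example are hypothetical, and the sketch ignores the sampling weights that a stratified sample would normally require.

```python
from collections import defaultdict


def predictive_values_by_group(records):
    """Compute PPV and NPV of a proxy within each group.

    records: iterable of (group, proxy_positive, truly_recurred) tuples,
    where the last two fields are booleans. This layout is an assumption
    made for illustration only.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, proxy_positive, truly_recurred in records:
        if proxy_positive and truly_recurred:
            key = "tp"  # proxy flags a recurrence that chart review confirms
        elif proxy_positive:
            key = "fp"  # proxy flags a recurrence that chart review rejects
        elif truly_recurred:
            key = "fn"  # proxy misses a chart-confirmed recurrence
        else:
            key = "tn"  # proxy and chart review agree on no recurrence
        counts[group][key] += 1

    results = {}
    for group, c in counts.items():
        flagged = c["tp"] + c["fp"]
        unflagged = c["tn"] + c["fn"]
        results[group] = {
            # PPV: share of proxy-positive patients who truly recurred
            "ppv": c["tp"] / flagged if flagged else None,
            # NPV: share of proxy-negative patients who truly did not recur
            "npv": c["tn"] / unflagged if unflagged else None,
        }
    return results


if __name__ == "__main__":
    # Made-up toy records, not study data.
    toy = [
        ("group_a", True, True),
        ("group_a", True, False),
        ("group_a", False, False),
        ("group_b", True, True),
        ("group_b", False, True),
    ]
    print(predictive_values_by_group(toy))
```

Returning `None` when a group has no proxy-positive (or proxy-negative) patients avoids a division by zero and keeps an undefined estimate distinct from a value of zero.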
The recurrence proxy had excellent overall accuracy (positive predictive value [PPV] 89.4%; negative predictive value 96.5%; mean difference in timing 1.96 months); however, accuracy varied by race/ethnicity. Compared with NHW patients, PPV was 14.9% lower (95% CI, 2.53 to 28.6) among Hispanic patients and 4.3% lower (95% CI, -4.8 to 14.8) among Black/AA patients. The proxy disproportionately inflated the 5-year recurrence incidence for Hispanic patients by 10.6% (95% CI, 4.2 to 18.2). Compared with NHW patients, proxy recurrences for Hispanic patients were almost three times as likely to have been misclassified as positive (adjusted risk ratio 2.91 [95% CI, 1.21 to 8.31]). The higher rate of false positives among racial/ethnic minority patients may be related to a higher prevalence of noncancerous lung-related problems and to substantial delays in primary treatment caused by insufficient patient-provider communication and abnormal treatment patterns.
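The idea of a proxy inflating recurrence incidence can be sketched as the difference between the proxy-based and chart-review-based proportions within one group. The following Python sketch is illustrative only: the function name `incidence_inflation` and the input layout are hypothetical, it uses crude proportions without sampling weights or censoring adjustments, and it is not presented as the estimator used in the study.

```python
def incidence_inflation(records):
    """Return proxy-based incidence minus chart-review incidence.

    records: iterable of (proxy_positive, truly_recurred) boolean pairs
    for the patients of a single group (hypothetical layout).
    A positive result means the proxy overstates recurrence incidence.
    """
    records = list(records)
    n = len(records)
    if n == 0:
        raise ValueError("at least one record is required")
    proxy_incidence = sum(1 for proxy, _ in records if proxy) / n
    true_incidence = sum(1 for _, truth in records if truth) / n
    return proxy_incidence - true_incidence


if __name__ == "__main__":
    # Made-up toy records, not study data.
    toy = [(True, True), (True, False), (False, False), (False, False)]
    print(incidence_inflation(toy))
```

Because false positives raise the proxy-based proportion and false negatives lower it, the difference reflects the net effect of both kinds of misclassification rather than the false-positive rate alone.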
Using a proxy with worse accuracy among racial/ethnic minority patients to estimate population health may misdirect resources and support erroneous conclusions around treatment benefit for these patients.