Dhungana Asim, Vannier Augustin, Zhao Fangyuan, Freeman Jincong Q, Saha Poornima, Sullivan Megan, Yao Katharine, Flores Elbio M, Olopade Olufunmilayo I, Pearson Alexander T, Huo Dezheng, Howard Frederick M
Pritzker School of Medicine, University of Chicago, Chicago, IL, USA.
Department of Public Health Sciences, University of Chicago, Chicago, IL, USA.
NPJ Breast Cancer. 2024 Jun 15;10(1):46. doi: 10.1038/s41523-024-00651-5.
Given the high cost of Oncotype DX (ODX) testing, widely used in recurrence risk assessment for early-stage breast cancer, studies have predicted ODX using quantitative clinicopathologic variables. However, such models have been developed in only small cohorts. Using a cohort of patients from the National Cancer Database (NCDB, n = 53,346), we trained machine learning models to predict low-risk (0-25) or high-risk (26-100) ODX using quantitative estrogen receptor (ER)/progesterone receptor (PR)/Ki-67 status, quantitative ER/PR status alone, and no quantitative features. Models were externally validated on a diverse cohort of 970 patients (median follow-up 55 months) for accuracy in ODX prediction and recurrence. Comparing the area under the receiver operating characteristic curve (AUROC) in a held-out set from NCDB, models incorporating quantitative ER/PR (AUROC 0.78, 95% CI 0.77-0.80) and ER/PR/Ki-67 (AUROC 0.81, 95% CI 0.80-0.83) outperformed the non-quantitative model (AUROC 0.70, 95% CI 0.68-0.72). These results were preserved in the validation cohort, where the ER/PR/Ki-67 model (AUROC 0.87, 95% CI 0.81-0.93, p = 0.009) and the ER/PR model (AUROC 0.86, 95% CI 0.80-0.92, p = 0.031) significantly outperformed the non-quantitative model (AUROC 0.80, 95% CI 0.73-0.87). Using a high-sensitivity rule-out threshold, the non-quantitative, quantitative ER/PR and ER/PR/Ki-67 models identified 35%, 30% and 43% of patients, respectively, as low-risk in the validation cohort. Of these low-risk patients, fewer than 3% had a recurrence at 5 years. These models may help identify patients who can forgo genomic testing and initiate endocrine therapy alone. An online calculator is provided for further study.
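The two evaluation ideas named in the abstract, AUROC and a high-sensitivity rule-out threshold, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy scores and labels, and the sensitivity target of 1.0 are all assumptions made for illustration, and the abstract does not state which sensitivity target was used. Labels use 1 for high-risk ODX (26-100) and 0 for low-risk ODX (0-25).

```python
import math
from typing import List


def auroc(scores: List[float], labels: List[int]) -> float:
    """AUROC as the probability that a randomly chosen high-risk case
    scores above a randomly chosen low-risk case (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rule_out_threshold(scores: List[float], labels: List[int],
                       target_sensitivity: float) -> float:
    """Largest threshold t such that calling score >= t 'high-risk'
    still captures at least target_sensitivity of true high-risk cases."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not pos:
        raise ValueError("no high-risk cases")
    # Number of high-risk cases that must sit at or above the threshold;
    # the small tolerance guards against floating-point overshoot.
    k = max(1, math.ceil(target_sensitivity * len(pos) - 1e-9))
    return pos[k - 1]


def fraction_ruled_out(scores: List[float], threshold: float) -> float:
    """Share of all patients scoring below the threshold (labelled low-risk)."""
    return sum(s < threshold for s in scores) / len(scores)


# Hypothetical toy data: predicted probability of high-risk ODX per patient.
toy_scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
toy_labels = [0, 0, 0, 0, 1, 0, 0, 1, 1, 1]

toy_auc = auroc(toy_scores, toy_labels)                       # 22/24
toy_threshold = rule_out_threshold(toy_scores, toy_labels, 1.0)  # 0.30
toy_low_risk = fraction_ruled_out(toy_scores, toy_threshold)     # 0.4
```

On this toy data, a threshold that keeps every high-risk case above it labels 4 of 10 patients as low-risk. Lowering the sensitivity target raises the threshold and rules out more patients, at the cost of missing some high-risk cases.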