Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA.
Hematology-Oncology, 177 Fort Washington Avenue, New York, NY, 10032, USA.
Breast Cancer Res Treat. 2023 Jul;200(2):237-245. doi: 10.1007/s10549-023-06966-4. Epub 2023 May 20.
Deep learning techniques, including convolutional neural networks (CNN), have the potential to improve breast cancer risk prediction compared to traditional risk models. We assessed whether combining a CNN-based mammographic evaluation with clinical factors in the Breast Cancer Surveillance Consortium (BCSC) model improved risk prediction.
We conducted a retrospective cohort study among 23,467 women, aged 35-74 years, undergoing screening mammography (2014-2018). We extracted electronic health record (EHR) data on risk factors. We identified 121 women who subsequently developed invasive breast cancer at least 1 year after the baseline mammogram. Mammograms were analyzed with a pixel-wise mammographic evaluation using a CNN architecture. We fit logistic regression models with breast cancer incidence as the outcome and, as predictors, either clinical factors only (BCSC model) or clinical factors combined with the CNN risk score (hybrid model). We compared model prediction performance using the area under the receiver operating characteristic curve (AUC).
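The AUC comparison described above can be illustrated with a minimal sketch. It is not the study's code: the labels and scores below are made-up toy values, the names `auc`, `clinical_only`, and `hybrid` are hypothetical, and the AUC is computed by direct pairwise comparison rather than by whatever software the authors used.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen non-case; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption, not from the study): 1 = developed breast cancer.
labels = [0, 0, 0, 0, 1, 1, 0, 1]
# Hypothetical predicted risks from a clinical-factors-only model.
clinical_only = [0.10, 0.20, 0.15, 0.30, 0.25, 0.35, 0.40, 0.20]
# Hypothetical predicted risks after adding an image-based risk score.
hybrid = [0.08, 0.18, 0.12, 0.28, 0.33, 0.45, 0.30, 0.26]

auc_clinical = auc(labels, clinical_only)
auc_hybrid = auc(labels, hybrid)
print(f"clinical-only AUC: {auc_clinical:.3f}")
print(f"hybrid AUC:        {auc_hybrid:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases. The sketch only computes the two AUCs; testing whether their difference is statistically significant requires a separate procedure for correlated AUCs, which the abstract does not name.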
Mean age was 55.9 years (SD, 9.5); 9.3% of women were non-Hispanic Black and 36% were Hispanic. Our hybrid model did not significantly improve risk prediction compared with the BCSC model (AUC 0.654 vs 0.624, respectively; p = 0.063). In subgroup analyses, the hybrid model outperformed the BCSC model among non-Hispanic Black women (AUC 0.845 vs 0.589; p = 0.026) and Hispanic women (AUC 0.650 vs 0.595; p = 0.049).
We aimed to develop an efficient breast cancer risk assessment method using a CNN risk score and clinical factors from the EHR. With future validation in a larger cohort, our CNN model combined with clinical factors may help predict breast cancer risk in a cohort of racially/ethnically diverse women undergoing screening.