From the Center for Human Technologies, Istituto Italiano di Tecnologia, Via Melen 83, Genoa 16152, Italy (C.D.); Wolfson Institute of Population Health, Queen Mary University of London, London, UK (C.D., E.F.L., J.C., A.R.B.); Institute of Computer Science (ICS), Foundation of Research and Technology Hellas, Heraklion, Crete, Greece (G.K.); Joint Director for Breast Screening, University Hospitals Coventry and Warwickshire NHS Trust, Coventry, UK (M.S.); Department of Oncoplastic Breast Surgery, University Hospitals of Leicester NHS Trust, Leicester, UK (M.A.A.); Consumer member at National Cancer Research Institute, Breast Group, London, UK (J.R., C.P.); and University of Warwick, WMG, Coventry, UK (G.M.).
Radiology. 2023 Jun;307(5):e222679. doi: 10.1148/radiol.222679.
Background: Accurate breast cancer risk assessment after a negative screening result could enable better strategies for early detection.
Purpose: To evaluate a deep learning algorithm for risk assessment based on digital mammograms.
Materials and Methods: A retrospective observational matched case-control study was designed using the OPTIMAM Mammography Image Database from the National Health Service Breast Screening Programme in the United Kingdom from February 2010 to September 2019. Patients with breast cancer (cases) were diagnosed following a mammographic screening or between two triennial screening rounds. Controls were matched based on mammography device, screening site, and age. The artificial intelligence (AI) model used only mammograms from the screening before diagnosis. The primary objective was to assess model performance, with a secondary objective to assess heterogeneity and calibration slope. The area under the receiver operating characteristic curve (AUC) was estimated for 3-year risk. Heterogeneity according to cancer subtype was assessed using a likelihood ratio interaction test. Statistical significance was set at P < .05.
Results: Analysis included patients with screen-detected (median age, 60 years [IQR, 55-65 years]; 2044 female, including 1528 with invasive cancer and 503 with ductal carcinoma in situ [DCIS]) or interval (median age, 59 years [IQR, 53-65 years]; 696 female, including 636 with invasive cancer and 54 with DCIS) breast cancer and 1:1 matched controls, each with a complete set of mammograms at the screening preceding diagnosis. The AI model had an overall AUC of 0.68 (95% CI: 0.66, 0.70), with no evidence of a significant difference between interval and screen-detected cancer (AUC, 0.69 vs 0.67; P = .085). The calibration slope was 1.13 (95% CI: 1.01, 1.26). Performance was similar for invasive cancer and DCIS (AUC, 0.68 vs 0.66; P = .057). The model had higher performance for advanced cancer risk (AUC, 0.72 for stage II or higher vs 0.66 for lower than stage II; P = .037). The AUC for detecting breast cancer in mammograms at diagnosis was 0.89 (95% CI: 0.88, 0.91).
Conclusion: The AI model was found to be a strong predictor of breast cancer risk for 3-6 years following a negative mammographic screening.
© RSNA, 2023. See also the editorial by Mann and Sechopoulos in this issue.
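As an illustration of the two headline metrics reported above, the following minimal Python sketch computes an AUC (the probability that a randomly chosen case receives a higher score than a randomly chosen control) and a calibration slope (the coefficient obtained when the outcome is regressed on the logit of the predicted risk). It is not the authors' analysis code: the function names, the toy data, and the use of ordinary (unconditional) logistic regression fitted by Newton-Raphson are assumptions made for illustration, and a matched case-control design such as this one may instead call for conditional logistic regression.

```python
import math


def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs in which the case scores
    higher than the control; tied pairs count as one half."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def calibration_fit(pred_risk, outcome, iters=25):
    """Fit logit(P(y = 1)) = a + b * logit(pred_risk) by Newton-Raphson.

    Returns (a, b); b is the calibration slope. Illustrative only:
    this is unconditional logistic regression and it assumes the data
    are not perfectly separated, so that the estimates are finite.
    """
    x = [logit(p) for p in pred_risk]
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, outcome):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient w.r.t. intercept
            g1 += (yi - mu) * xi     # gradient w.r.t. slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


if __name__ == "__main__":
    # Hypothetical toy scores, not data from the study.
    print("AUC:", auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.4]))

    # Toy predicted risks whose observed event rates match them exactly.
    risks = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
    events = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
    intercept, slope = calibration_fit(risks, events)
    print("calibration intercept:", intercept, "slope:", slope)
```

A calibration slope of 1 indicates that predicted risks are spread appropriately relative to observed outcomes; a slope above 1, such as the 1.13 reported here, suggests that the predictions are somewhat less spread out than the observed risks.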