Martiniussen Marit A, Larsen Marthe, Hovda Tone, Kristiansen Merete U, Dahl Fredrik A, Eikvil Line, Brautaset Olav, Bjørnerud Atle, Kristensen Vessela, Bergan Marie B, Hofvind Solveig
Department of Radiology, Østfold Hospital Trust, Kalnes, Norway.
University of Oslo, Institute of Clinical Medicine, Oslo, Norway.
Radiol Artif Intell. 2025 May;7(3):e240039. doi: 10.1148/ryai.240039.
Purpose: To evaluate cancer detection and marker placement accuracy of two artificial intelligence (AI) models developed for interpretation of screening mammograms.

Materials and Methods: This retrospective study included data from 129 434 screening examinations (all female patients; mean age, 59.2 years ± 5.8 [SD]) performed between January 2008 and December 2018 in BreastScreen Norway. Model A was commercially available and model B was an in-house model. Areas under the receiver operating characteristic curve (AUCs) with 95% CIs were calculated. The study defined the 3.2% and 11.1% of examinations with the highest AI scores as positive at threshold 1 and threshold 2, respectively. A radiologic review assessed the location of AI markings and classified interval cancers as true or false negative.

Results: The AUC was 0.93 (95% CI: 0.92, 0.94) for both model A and model B when including screen-detected and interval cancers. Model A identified 82.5% (611 of 741) of the screen-detected cancers at threshold 1 and 92.4% (685 of 741) at threshold 2. Model B identified 81.8% (606 of 741) at threshold 1 and 93.7% (694 of 741) at threshold 2. The AI markings were correctly localized for all screen-detected cancers identified by both models and for 82% (56 of 68) of the interval cancers for model A and 79% (54 of 68) for model B. At the review, 21.6% (45 of 208) of the interval cancers were identified at the preceding screening by either or both models, correctly localized, and classified as false negative (n = 17) or as showing minimal signs of malignancy (n = 28).

Conclusion: Both AI models showed promising performance for cancer detection on screening mammograms. The AI markings corresponded well to the true cancer locations.

Keywords: Breast, Mammography, Screening, Computer-aided Diagnosis

© RSNA, 2025. See also commentary by Cadrin-Chênevert in this issue.
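The percentages reported in the abstract can be recomputed from the counts it gives, and the "highest-scoring fraction" rule for calling an examination positive can be sketched in a few lines. The following is a minimal illustration only, not the study's code: the function names `pct` and `flag_top_fraction` are hypothetical, and the tie-handling rule (scores equal to the cutoff are flagged) is an assumption, since the abstract does not say how ties were treated.

```python
import math


def pct(numerator, denominator, digits=1):
    """Return numerator/denominator as a percentage rounded to `digits` places."""
    return round(100 * numerator / denominator, digits)


def flag_top_fraction(scores, fraction):
    """Flag the examinations whose AI score falls in the top `fraction`.

    Assumption (not stated in the abstract): scores tied with the cutoff
    are flagged too, so slightly more than `fraction` may be positive.
    """
    k = math.ceil(fraction * len(scores))
    if k == 0:
        return [False] * len(scores)
    cutoff = sorted(scores, reverse=True)[k - 1]
    return [s >= cutoff for s in scores]


# Counts taken directly from the abstract.
screen_detected = 741
rates = {
    "model A, threshold 1": pct(611, screen_detected),
    "model A, threshold 2": pct(685, screen_detected),
    "model B, threshold 1": pct(606, screen_detected),
    "model B, threshold 2": pct(694, screen_detected),
}
interval_flagged_at_prior_screen = pct(17 + 28, 208)

for label, value in rates.items():
    print(f"{label}: {value}%")
print(f"interval cancers flagged at prior screening: {interval_flagged_at_prior_screen}%")
```

With the study's thresholds, `fraction` would be 0.032 (threshold 1) or 0.111 (threshold 2), applied to the AI scores of all 129 434 examinations.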