From the Breast Cancer Unit, Department of Radiology, Hospital Universitario Reina Sofía, Av Menéndez Pidal s/n, Córdoba 14004, Spain (S.R., E.E., J.L.R., M.Á.); Maimonides Institute for Biomedical Research of Córdoba, Córdoba, Spain (S.R., E.E., J.L.R., M.Á.); and Department of Clinical Science, ScreenPoint Medical, Nijmegen, the Netherlands (A.G., A.R.).
Radiology. 2022 Mar;302(3):535-542. doi: 10.1148/radiol.211590. Epub 2021 Dec 14.
Background Use of artificial intelligence (AI) as a stand-alone reader for digital mammography (DM) or digital breast tomosynthesis (DBT) breast screening could ease radiologists' workload while maintaining quality.
Purpose To retrospectively evaluate the stand-alone performance of an AI system as an independent reader of DM and DBT screening examinations.
Materials and Methods Consecutive screening-paired and independently read DM and DBT images acquired between January 2015 and December 2016 were retrospectively collected from the Tomosynthesis Cordoba Screening Trial. An AI system computed a cancer risk score (range, 1-100) for DM and DBT examinations independently. AI stand-alone performance was measured using the area under the receiver operating characteristic curve (AUC) and using sensitivity and recall rate at different operating points selected to have noninferior sensitivity compared with the human readings (noninferiority margin, 5%). The recall rates of AI and of the human readings were compared using a McNemar test.
Results A total of 15 999 DM and DBT examinations (113 breast cancers, including 98 screen-detected and 15 interval cancers) from 15 998 women (mean age, 58 years ± 6 [standard deviation]) were evaluated. AI achieved an AUC of 0.93 (95% CI: 0.89, 0.96) for DM and 0.94 (95% CI: 0.91, 0.97) for DBT. For DM, AI achieved noninferior sensitivity as a single (58.4%; 66 of 113; 95% CI: 49.2, 67.1) or double (67.3%; 76 of 113; 95% CI: 58.2, 75.2) reader, with a reduction in recall rate (P < .001) of up to 2% (95% CI: -2.4, -1.6). For DBT, AI achieved noninferior sensitivity as a single (77%; 87 of 113; 95% CI: 68.4, 83.8) or double (81.4%; 92 of 113; 95% CI: 73.3, 87.5) reader, but with a higher recall rate (P < .001) of up to 12.3% (95% CI: 11.7, 12.9).
Conclusion Artificial intelligence could replace radiologists' readings in breast screening, achieving a noninferior sensitivity, with a lower recall rate for digital mammography but a higher recall rate for digital breast tomosynthesis.
Published under a CC BY 4.0 license. See also the editorial by Fuchsjäger and Adelsmayr in this issue.
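The sensitivities in the Results are simple proportions of the 113 cancers, each reported with a 95% CI. The abstract does not say which interval method was used, so the sketch below assumes a Wilson score interval. Under that assumption the computed bounds agree with the reported ones to within about 0.1 percentage point. The function name `wilson_interval` and the labels in `reported_detections` are illustrative and do not come from the study.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximate 95% confidence level.
    The choice of the Wilson method is an assumption; the abstract
    does not name the interval method.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    z2 = z * z
    denom = 1 + z2 / total
    center = (p_hat + z2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total ** 2)) / denom
    )
    return center - half_width, center + half_width


# Cancers detected out of 113, taken from the Results of the abstract.
reported_detections = {
    "DM, AI as single reader": 66,
    "DM, AI as double reader": 76,
    "DBT, AI as single reader": 87,
    "DBT, AI as double reader": 92,
}

if __name__ == "__main__":
    total_cancers = 113
    for label, detected in reported_detections.items():
        low, high = wilson_interval(detected, total_cancers)
        sensitivity = 100 * detected / total_cancers
        print(f"{label}: {sensitivity:.1f}% "
              f"(95% CI: {100 * low:.1f}, {100 * high:.1f})")
```

The Wilson interval is a common choice for proportions at this sample size because, unlike the simple normal approximation, it stays within 0 to 100% and behaves better when the proportion is far from 50%.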