Lee Kyung Eun, Song Sung Eun, Cho Kyu Ran, Bae Min Sun, Seo Bo Kyoung, Kim Soo-Yeon, Woo Ok Hee
Department of Radiology, Korea University Guro Hospital, Korea University College of Medicine, Seoul, Republic of Korea.
Department of Radiology, Korea University Anam Hospital, Korea University College of Medicine, Seoul, Republic of Korea.
Korean J Radiol. 2025 Mar;26(3):217-229. doi: 10.3348/kjr.2024.0664.
To test the performance of an artificial intelligence-based computer-aided diagnosis (AI-CAD) designed for full-field digital mammography (FFDM) when applied to synthetic mammography (SM).
We analyzed 501 women (mean age, 57 ± 11 years) who underwent preoperative mammography and breast cancer surgery. This cohort comprised 1002 breasts: 517 with cancer and 485 without. All patients underwent digital breast tomosynthesis (DBT) and FFDM during the preoperative workup. SM images were routinely reconstructed from the DBT data. A commercial AI-CAD (Lunit Insight MMG, version 1.1.7.2) was retrospectively applied to SM and FFDM to calculate an abnormality score for each breast. The median abnormality scores of the 517 breasts with cancer were compared between SM and FFDM using the Wilcoxon signed-rank test. Calibration curves of the abnormality scores were evaluated. Discrimination performance was analyzed using the area under the receiver operating characteristic curve (AUC) and, at the preset abnormality-score threshold of 10%, sensitivity and specificity. Sensitivity and specificity were further analyzed according to mammographic and pathological characteristics. The results for SM and FFDM were compared.
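The per-breast discrimination metrics named above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function names and the toy scores are hypothetical, and treating a score equal to the threshold as positive (`>=`) is an assumption, since the abstract does not say how ties at the 10% cutoff are handled.

```python
from typing import Sequence, Tuple

THRESHOLD = 10.0  # preset abnormality-score cutoff in percent, as stated in the abstract


def sensitivity_specificity(scores: Sequence[float],
                            has_cancer: Sequence[bool],
                            threshold: float = THRESHOLD) -> Tuple[float, float]:
    """Per-breast sensitivity and specificity.

    Assumption: a score >= threshold counts as a positive call.
    """
    pairs = list(zip(scores, has_cancer))
    tp = sum(1 for s, y in pairs if y and s >= threshold)
    fn = sum(1 for s, y in pairs if y and s < threshold)
    tn = sum(1 for s, y in pairs if not y and s < threshold)
    fp = sum(1 for s, y in pairs if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], has_cancer: Sequence[bool]) -> float:
    """AUC as the probability that a cancer breast outscores a non-cancer breast.

    Ties between a cancer and a non-cancer score count as half a win.
    """
    pos = [s for s, y in zip(scores, has_cancer) if y]
    neg = [s for s, y in zip(scores, has_cancer) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 4 cancer and 4 non-cancer breasts.
toy_scores = [96.0, 71.0, 8.0, 45.0, 3.0, 12.0, 1.0, 5.0]
toy_has_cancer = [True, True, True, True, False, False, False, False]

print(sensitivity_specificity(toy_scores, toy_has_cancer))  # (0.75, 0.75)
print(auc(toy_scores, toy_has_cancer))                      # 0.9375
```

Sensitivity and specificity depend on the chosen cutoff, whereas the AUC depends only on how the scores rank cancer against non-cancer breasts.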
AI-CAD demonstrated a significantly lower median abnormality score (71% vs. 96%, P < 0.001) and poorer calibration performance for SM than for FFDM. Compared with FFDM, SM exhibited lower sensitivity (76.2% vs. 82.8%, P < 0.001), higher specificity (95.5% vs. 91.8%, P < 0.001), and a comparable AUC (0.86 vs. 0.87, P = 0.127). SM showed lower sensitivity than FFDM in asymptomatic breasts, dense breasts, ductal carcinoma in situ, T1, N0, and hormone receptor-positive/human epidermal growth factor receptor 2-negative cancers but showed higher specificity in non-cancerous dense breasts.
AI-CAD showed lower abnormality scores and reduced calibration performance for SM than for FFDM. Furthermore, the 10% preset threshold yielded different discrimination performance for SM than for FFDM, with lower sensitivity and higher specificity. Given these limitations, off-label application of the current AI-CAD to SM should be avoided.
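A toy example may help show how a comparable AUC can coexist with different sensitivity and specificity at a fixed cutoff. In the sketch below, the hypothetical "SM" scores are simply the hypothetical "FFDM" scores halved; this scaling, the numbers, and the `metrics` helper are all assumptions made for illustration and do not describe how the AI-CAD scores actually differed in the study. Because halving preserves the ranking of breasts, the AUC is unchanged, while fewer breasts (cancerous or not) reach the 10% threshold.

```python
from typing import Sequence, Tuple


def metrics(scores: Sequence[float], has_cancer: Sequence[bool],
            threshold: float = 10.0) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, AUC); score >= threshold is a positive call."""
    pos = [s for s, y in zip(scores, has_cancer) if y]
    neg = [s for s, y in zip(scores, has_cancer) if not y]
    sensitivity = sum(1 for s in pos if s >= threshold) / len(pos)
    specificity = sum(1 for s in neg if s < threshold) / len(neg)
    # Rank-based AUC: fraction of (cancer, non-cancer) pairs ranked correctly.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return sensitivity, specificity, wins / (len(pos) * len(neg))


# Hypothetical toy scores (NOT study data): 4 cancer, then 4 non-cancer breasts.
has_cancer = [True, True, True, True, False, False, False, False]
ffdm_scores = [96.0, 40.0, 15.0, 8.0, 12.0, 4.0, 2.0, 1.0]
# Assumed for illustration only: "SM" scores are a rank-preserving downward shift.
sm_scores = [0.5 * s for s in ffdm_scores]

ffdm = metrics(ffdm_scores, has_cancer)
sm = metrics(sm_scores, has_cancer)
print(ffdm)  # (0.75, 0.75, 0.9375)
print(sm)    # (0.5, 1.0, 0.9375)
```

In this toy setting, the same cutoff yields lower sensitivity and higher specificity for the shifted scores even though the ranking, and therefore the AUC, is identical.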