Kwon Mi-Ri, Kim Sung Hun, Park Ga Eun, Mun Han Song, Kang Bong Joo, Kim Yun Tae, Yoon Inyoung
Department of Radiology, Kangbuk Samsung Hospital, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
Department of Radiology, Graduate School, The Catholic University of Korea, Seoul, Republic of Korea.
Radiol Med. 2025 Jun 20. doi: 10.1007/s11547-025-02033-8.
To evaluate the agreement between artificial intelligence (AI)-based tumor size measurements of breast cancer and the final pathology and compare these results with those of other imaging modalities.
This retrospective study included 925 women (mean age, 55.3 ± 11.6 years) with 936 breast cancers who underwent digital mammography, breast ultrasound, and magnetic resonance imaging before breast cancer surgery. AI-based tumor size measurement was performed on post-processed mammographic images by outlining areas with AI abnormality scores of 10%, 50%, and 90%. Absolute agreement between AI-based tumor sizes, sizes measured on each imaging modality, and histopathology was assessed using intraclass correlation coefficient (ICC) analysis. Concordant and discordant cases between AI measurements and histopathologic examination were compared.
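The absolute-agreement ICC mentioned in the methods can be illustrated with a minimal sketch. It assumes the two-way, single-measurement, absolute-agreement form (often written ICC(A,1)); the abstract does not state which ICC model the authors used, so this choice is an assumption. The function name `icc_absolute_agreement` and the tumor sizes in the usage lines are hypothetical and are not data from the study.

```python
import numpy as np

def icc_absolute_agreement(ratings):
    """Two-way, single-measurement, absolute-agreement ICC (assumed form).

    ratings: array of shape (n_subjects, k_methods), e.g. one column of
    AI-based sizes and one column of histopathologic sizes, in mm.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand_mean = x.mean()
    row_means = x.mean(axis=1)   # per-tumor mean across methods
    col_means = x.mean(axis=0)   # per-method mean across tumors

    # Two-way ANOVA sums of squares
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic differences between methods (ms_cols) lower the ICC,
    # which is what distinguishes absolute agreement from consistency.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical sizes in mm: column 0 = AI-based, column 1 = histopathology.
sizes = [[12, 15], [22, 20], [35, 41], [18, 17], [27, 33]]
icc = icc_absolute_agreement(sizes)
```

Because the between-method mean square appears in the denominator, a method that overestimates every tumor by a constant amount is penalized even though it ranks the tumors perfectly, which is why an absolute-agreement form suits a size comparison against pathology.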
Tumor size measured at an abnormality score of 50% showed the highest agreement with histopathologic examination (ICC = 0.54, 95% confidence interval [CI]: 0.49-0.59), comparable to that of mammography (ICC = 0.54, 95% CI: 0.48-0.60, p = 0.40). For ductal carcinoma in situ and human epidermal growth factor receptor 2-positive cancers, AI showed higher agreement than mammography (ICC = 0.76, 95% CI: 0.67-0.84 and ICC = 0.73, 95% CI: 0.52-0.85, respectively). Overall, 52.0% (487/936) of cases were discordant; discordance was more common in younger patients with dense breasts, multifocal malignancies, lower abnormality scores, and different imaging characteristics.
AI-based tumor size measurements at an abnormality score of 50% showed moderate agreement with histopathology but were discordant in size in more than half of the cases. Although the AI-based measurements were comparable to mammography, their limitations emphasize the need for further refinement and research.