Abdullah Kamarul Amin, Marziali Sara, Nanaa Muzna, Escudero Sánchez Lorena, Payne Nicholas R, Gilbert Fiona J
Department of Radiology, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, UK.
Universiti Sultan Zainal Abidin, Terengganu, Malaysia.
Eur Radiol. 2025 Feb 5. doi: 10.1007/s00330-025-11406-6.
The aim of this work is to evaluate the performance of deep learning (DL) models for breast cancer diagnosis with MRI.
A literature search was conducted on Web of Science, PubMed, and IEEE Xplore for relevant studies published from January 2015 to February 2024. The study was registered with the PROSPERO International Prospective Register of Systematic Reviews (protocol no. CRD42024485371). The Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool and the Must AI Criteria-10 (MAIC-10) checklist were used to assess quality and risk of bias. The meta-analysis included studies reporting the performance of DL models for breast cancer diagnosis, from which pooled summary estimates of the area under the curve (AUC), sensitivity, and specificity were calculated.
A total of 40 studies were included, of which only 21 were eligible for quantitative analysis. Convolutional neural networks (CNNs) were used in 62.5% (25/40) of the implemented models, with hybrid composite models (HCMs) making up the remaining 37.5% (15/40). The pooled estimates of AUC, sensitivity, and specificity were 0.90 (95% CI: 0.87, 0.93), 88% (95% CI: 86%, 91%), and 90% (95% CI: 87%, 93%), respectively.
DL models used for breast cancer diagnosis on MRI achieve high performance. However, there is considerable inherent variability in this analysis. Therefore, continuous evaluation and refinement of DL models are essential to ensure their practicality in the clinical setting.
Question: Can DL models improve diagnostic accuracy in breast MRI, addressing challenges like overfitting and heterogeneity in study designs and imaging sequences?
Findings: DL achieved high diagnostic accuracy (AUC 0.90, sensitivity 88%, specificity 90%) in breast MRI, with training size significantly impacting performance metrics (p < 0.001).
Clinical relevance: DL models demonstrate high accuracy in breast cancer diagnosis using MRI, showing the potential to enhance diagnostic confidence and reduce radiologist workload, especially with larger datasets minimizing overfitting and improving clinical reliability.
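The abstract reports pooled summary estimates with 95% confidence intervals but does not state which pooling model was used. As a minimal illustration of how such a pooled proportion can be obtained, the sketch below applies a univariate DerSimonian-Laird random-effects model to logit-transformed per-study proportions. This is an assumed, simplified stand-in (diagnostic accuracy meta-analyses often use bivariate models instead), and the function name `pool_logit_random_effects` and the example counts are hypothetical, not taken from the included studies.

```python
import math


def pool_logit_random_effects(events, totals):
    """Pool per-study proportions (e.g. sensitivity = TP / diseased).

    Assumption: univariate DerSimonian-Laird random-effects model on the
    logit scale. Returns (estimate, ci_low, ci_high, tau2).
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n
        a = e + 0.5
        b = n - e + 0.5
        logits.append(math.log(a / b))
        variances.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) weights and pooled logit
    w = [1.0 / v for v in variances]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sw

    # Cochran's Q and the DerSimonian-Laird between-study variance tau^2
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, logits))
    k = len(logits)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def inv_logit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), tau2


# Hypothetical true-positive and diseased counts for three studies
est, lo, hi, tau2 = pool_logit_random_effects([45, 88, 130], [50, 100, 150])
print(f"pooled sensitivity {est:.2f} (95% CI: {lo:.2f}, {hi:.2f}), tau^2 = {tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed estimate and interval inside (0, 1); a larger `tau2` signals more between-study heterogeneity, which widens the interval.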