Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis.

Author information

Aggarwal Ravi, Sounderajah Viknesh, Martin Guy, Ting Daniel S W, Karthikesalingam Alan, King Dominic, Ashrafian Hutan, Darzi Ara

Affiliations

Institute of Global Health Innovation, Imperial College London, London, UK.

Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore.

Publication information

NPJ Digit Med. 2021 Apr 7;4(1):65. doi: 10.1038/s41746-021-00438-z.

DOI:10.1038/s41746-021-00438-z

PMID:33828217

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8027892/

Abstract

Deep learning (DL) has the potential to transform medical diagnostics. However, the diagnostic accuracy of DL is uncertain. Our aim was to evaluate the diagnostic accuracy of DL algorithms to identify pathology in medical imaging. Searches were conducted in Medline and EMBASE up to January 2020. We identified 11,921 studies, of which 503 were included in the systematic review. Eighty-two studies in ophthalmology, 82 in breast disease and 115 in respiratory disease were included for meta-analysis. Two hundred twenty-four studies in other specialities were included for qualitative review. Peer-reviewed studies that reported on the diagnostic accuracy of DL algorithms to identify pathology using medical imaging were included. Primary outcomes were measures of diagnostic accuracy, study design and reporting standards in the literature. Estimates were pooled using random-effects meta-analysis. In ophthalmology, AUCs ranged between 0.933 and 1 for diagnosing diabetic retinopathy, age-related macular degeneration and glaucoma on retinal fundus photographs and optical coherence tomography. In respiratory imaging, AUCs ranged between 0.864 and 0.937 for diagnosing lung nodules or lung cancer on chest X-ray or CT scan. For breast imaging, AUCs ranged between 0.868 and 0.909 for diagnosing breast cancer on mammogram, ultrasound, MRI and digital breast tomosynthesis. Heterogeneity was high between studies and extensive variation in methodology, terminology and outcome measures was noted. This can lead to an overestimation of the diagnostic accuracy of DL algorithms on medical imaging. There is an immediate need for the development of artificial intelligence-specific EQUATOR guidelines, particularly STARD, in order to provide guidance around key issues in this field.
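The abstract states that estimates were pooled using random-effects meta-analysis but does not name the estimator. As a rough illustration only, the sketch below assumes the DerSimonian-Laird method, one common choice, and is not taken from the paper. The function name `pool_random_effects` and its inputs are hypothetical; the inputs are per-study estimates on a common scale with their within-study variances.

```python
import math


def pool_random_effects(estimates, variances):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    Assumption: DerSimonian-Laird is used here for illustration; the source
    abstract does not say which random-effects estimator was applied.
    estimates: per-study effect estimates on a common scale.
    variances: matching within-study variances (each must be > 0).
    """
    if len(estimates) != len(variances) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1

    # Method-of-moments between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to every within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re
    se = math.sqrt(1.0 / sum_w_re)

    # I^2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {"pooled": pooled, "se": se, "tau2": tau2, "i2": i2}
```

A call such as `pool_random_effects(estimates, variances)` returns the pooled estimate, its standard error, the between-study variance `tau2` and the `i2` heterogeneity percentage. Bounded measures such as AUC, sensitivity or specificity are often transformed (for example to the logit scale) before pooling; that step is left out of this sketch.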

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04a4/8027892/63c890eeacdc/41746_2021_438_Fig1_HTML.jpg
