Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis.

Authors

Aggarwal Ravi, Sounderajah Viknesh, Martin Guy, Ting Daniel S W, Karthikesalingam Alan, King Dominic, Ashrafian Hutan, Darzi Ara

Affiliations

Institute of Global Health Innovation, Imperial College London, London, UK.

Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore.

Publication information

NPJ Digit Med. 2021 Apr 7;4(1):65. doi: 10.1038/s41746-021-00438-z.

Abstract

Deep learning (DL) has the potential to transform medical diagnostics. However, the diagnostic accuracy of DL is uncertain. Our aim was to evaluate the diagnostic accuracy of DL algorithms to identify pathology in medical imaging. Searches were conducted in Medline and EMBASE up to January 2020. We identified 11,921 studies, of which 503 were included in the systematic review. Eighty-two studies in ophthalmology, 82 in breast disease and 115 in respiratory disease were included for meta-analysis. Two hundred twenty-four studies in other specialities were included for qualitative review. Peer-reviewed studies that reported on the diagnostic accuracy of DL algorithms to identify pathology using medical imaging were included. Primary outcomes were measures of diagnostic accuracy, study design and reporting standards in the literature. Estimates were pooled using random-effects meta-analysis. In ophthalmology, AUCs ranged between 0.933 and 1 for diagnosing diabetic retinopathy, age-related macular degeneration and glaucoma on retinal fundus photographs and optical coherence tomography. In respiratory imaging, AUCs ranged between 0.864 and 0.937 for diagnosing lung nodules or lung cancer on chest X-ray or CT scan. For breast imaging, AUCs ranged between 0.868 and 0.909 for diagnosing breast cancer on mammogram, ultrasound, MRI and digital breast tomosynthesis. Heterogeneity between studies was high, and extensive variation in methodology, terminology and outcome measures was noted. Such variation can lead to an overestimation of the diagnostic accuracy of DL algorithms on medical imaging. There is an immediate need for the development of artificial intelligence-specific EQUATOR guidelines, particularly STARD, in order to provide guidance around key issues in this field.
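The abstract states that estimates were pooled using random-effects meta-analysis but does not say which estimator was used. As an illustration only, the sketch below uses the DerSimonian-Laird method, one common choice, to pool per-study effects and report the between-study variance and the I² heterogeneity statistic. The function name `dersimonian_laird` and the input numbers are hypothetical and are not taken from the paper. Diagnostic accuracy reviews often use more elaborate bivariate models, so this is a simplified sketch rather than the authors' method.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effects with a DerSimonian-Laird random-effects model.

    effects:   per-study effect estimates (e.g. logit-transformed accuracy)
    variances: within-study variances of those estimates
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw

    # Cochran's Q measures the observed spread around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))

    # Method-of-moments estimate of the between-study variance tau^2.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "q": q,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical logit-scale estimates and variances, for illustration only.
    toy_effects = [2.1, 2.9, 1.7, 3.3]
    toy_variances = [0.04, 0.09, 0.05, 0.12]
    result = dersimonian_laird(toy_effects, toy_variances)
    for name, value in result.items():
        print(name, value)
```

A high I² from a calculation like this corresponds to the high between-study heterogeneity the abstract reports. In that situation the pooled value summarises studies that may not be measuring the same thing, so it should be read with caution.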

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/04a4/8027892/63c890eeacdc/41746_2021_438_Fig1_HTML.jpg
