Performance and Limitation of Machine Learning Algorithms for Diabetic Retinopathy Screening: Meta-analysis.

Affiliations

Shiley Eye Institute and Viterbi Family Department of Ophthalmology, University of California San Diego, La Jolla, CA, United States.

Retina Division, Wilmer Eye Institute, The Johns Hopkins Medicine, Baltimore, MD, United States.

Publication information

J Med Internet Res. 2021 Jul 3;23(7):e23863. doi: 10.2196/23863.

Abstract

BACKGROUND

Diabetic retinopathy (DR), whose standard diagnosis is performed by human experts, has high prevalence and requires a more efficient screening method. Although machine learning (ML)-based automated DR diagnosis has gained attention due to recent approval of IDx-DR, performance of this tool has not been examined systematically, and the best ML technique for use in a real-world setting has not been discussed.

OBJECTIVE

The aim of this study was to systematically examine the overall diagnostic accuracy of ML in diagnosing DR of different categories based on color fundus photographs and to determine the state-of-the-art ML approach.

METHODS

PubMed and EMBASE were searched for published studies from inception to June 2020. Studies were screened for relevant outcomes, publication types, and data sufficiency, and 60 of 2128 (2.82%) studies were retained after study selection. Data were extracted by 2 authors according to PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses), and quality was assessed with the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool. Diagnostic accuracy was pooled using a bivariate random effects model. The main outcomes were the diagnostic accuracy, sensitivity, and specificity of ML in diagnosing DR based on color fundus photographs, as well as the performance of the major types of ML algorithms.
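
As an illustration of the study-level quantities that enter a diagnostic accuracy meta-analysis, the following Python sketch computes sensitivity and specificity from a single 2x2 table, together with their logit transforms and approximate variances, which are the usual per-study inputs to a bivariate random effects model. This is not the authors' code: the counts, the function name `study_summary`, and the 0.5 continuity correction are assumptions made for illustration, and the bivariate model fit itself is not shown.

```python
import math


def study_summary(tp, fp, fn, tn, correction=0.5):
    """Return per-study sensitivity/specificity with logit values and variances.

    All counts are hypothetical. The continuity correction is applied only
    when a cell is zero (an assumed, commonly used convention).
    """
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))

    sens = tp / (tp + fn)  # proportion of diseased eyes flagged
    spec = tn / (tn + fp)  # proportion of non-diseased eyes cleared

    def logit(p):
        return math.log(p / (1 - p))

    return {
        "sensitivity": sens,
        "specificity": spec,
        "logit_sens": logit(sens),
        "logit_spec": logit(spec),
        # Approximate variances of the logit-transformed proportions
        "var_logit_sens": 1 / tp + 1 / fn,
        "var_logit_spec": 1 / tn + 1 / fp,
    }


# Hypothetical study: 100 eyes with DR, 200 without
summary = study_summary(tp=95, fp=20, fn=5, tn=180)
print(round(summary["sensitivity"], 2), round(summary["specificity"], 2))
```

A bivariate model would then pool the paired logit values across studies while allowing for between-study variation and for correlation between sensitivity and specificity.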

RESULTS

The primary meta-analysis included 60 color fundus photograph studies (445,175 interpretations). Overall, ML demonstrated high accuracy in diagnosing DR of various categories, with a pooled area under the receiver operating characteristic curve (AUROC) ranging from 0.97 (95% CI 0.96-0.99) to 0.99 (95% CI 0.98-1.00). The performance of ML in detecting more-than-mild DR was robust (sensitivity 0.95; AUROC 0.97), and subgroup analyses showed that this robust performance was not limited to benchmark data sets (sensitivity 0.92; AUROC 0.96) but generalized to images collected in clinical practice (sensitivity 0.97; AUROC 0.97). Neural networks were the most widely used method, and the subgroup analysis revealed a pooled AUROC of 0.98 (95% CI 0.96-0.99) for studies that used neural networks to diagnose more-than-mild DR.
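
For readers less familiar with the AUROC metric reported here, the sketch below shows how it can be computed within a single study: it is the fraction of (diseased, non-diseased) image pairs in which the diseased image receives the higher algorithm score, with ties counted as half. The scores and the function name `auroc` are hypothetical, and this single-study calculation is not the pooled AUROC of the meta-analysis, which is derived from the fitted summary model.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the share of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical algorithm outputs (higher = more likely DR)
dr_scores = [0.9, 0.8, 0.75, 0.6]
no_dr_scores = [0.7, 0.4, 0.3, 0.2]

value = auroc(dr_scores, no_dr_scores)
print(value)
```

In this toy example only one of the 16 pairs is misordered (the DR image scored 0.6 against the non-DR image scored 0.7), so the AUROC is 15/16.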

CONCLUSIONS

This meta-analysis demonstrated high diagnostic accuracy of ML algorithms in detecting DR on color fundus photographs, suggesting that state-of-the-art, ML-based DR screening algorithms are likely ready for clinical applications. However, a significant portion of the earlier published studies had methodological flaws, such as a lack of external validation and the presence of spectrum bias. The results of these studies should be interpreted with caution.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24a3/8406115/d8a0820e9531/jmir_v23i7e23863_fig1.jpg
