Eight-year Program of Clinical Medicine, Peking Union Medical College Hospital (PUMCH), Chinese Academy of Medical Sciences & Peking Union Medical College (CAMS & PUMC), Beijing, China.
Medical Research Center, PUMCH, CAMS & PUMC, Beijing, China.
Eur J Endocrinol. 2020 Jun 1;183(1):41-49. doi: 10.1530/EJE-19-0968.
Automatic screening systems based on neural networks have been used to detect diabetic retinopathy (DR). However, there has been no quantitative synthesis of the performance of these methods. We aimed to estimate the sensitivity and specificity of neural networks in DR grading.
Medline, Embase, IEEE Xplore, and the Cochrane Library were searched up to 23 July 2019. Studies were included if they evaluated the performance of neural networks in detecting moderate or worse DR or diabetic macular edema from retinal fundus images, with ophthalmologists' judgment as the reference standard. Two reviewers extracted data independently. Risk of bias in eligible studies was assessed using the QUADAS-2 tool.
Twenty-four studies involving 235,235 subjects were included. Quantitative random-effects meta-analysis using the Rutter and Gatsonis hierarchical summary receiver operating characteristic (HSROC) model yielded a pooled sensitivity of 91.9% (95% CI: 89.6% to 94.3%) and a pooled specificity of 91.3% (95% CI: 89.0% to 93.5%). Diagnostic accuracy was heterogeneous across studies, but subgroup analyses and meta-regression found no statistically significant differences between studies that differed in image resolution, training-set sample size, convolutional neural network architecture, or diagnostic criteria.
State-of-the-art neural networks could effectively detect clinically significant DR. To further improve the diagnostic accuracy of neural networks, researchers might need to develop new algorithms rather than simply enlarging training sets or optimizing image quality.
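To make the idea of pooled sensitivity and specificity concrete, the following is a minimal sketch of random-effects pooling of per-study proportions. It is not the Rutter and Gatsonis HSROC model used in the study: it swaps in a simpler univariate DerSimonian-Laird estimator on the logit scale, which pools sensitivity and specificity separately and ignores their correlation. The function name `pool_proportion` and the 2x2 counts in `studies` are hypothetical and are not data from the included studies.

```python
import math


def inv_logit(x):
    """Map a logit back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportion(events, totals):
    """Pool proportions with a DerSimonian-Laird random-effects model.

    Simplified univariate stand-in for illustration only; it is NOT the
    HSROC model. Returns (estimate, lower 95% bound, upper 95% bound).
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction keeps the logit finite for 0% or 100%.
        pos, neg = e + 0.5, (n - e) + 0.5
        logits.append(math.log(pos / neg))
        variances.append(1.0 / pos + 1.0 / neg)

    # Fixed-effect (inverse-variance) mean, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, logits))

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(logits) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)


# Hypothetical per-study 2x2 counts: (TP, FN, TN, FP).
studies = [(90, 10, 180, 20), (450, 30, 900, 100), (85, 15, 270, 30)]

sens = pool_proportion([tp for tp, fn, tn, fp in studies],
                       [tp + fn for tp, fn, tn, fp in studies])
spec = pool_proportion([tn for tp, fn, tn, fp in studies],
                       [tn + fp for tp, fn, tn, fp in studies])

print("pooled sensitivity: %.3f (95%% CI %.3f to %.3f)" % sens)
print("pooled specificity: %.3f (95%% CI %.3f to %.3f)" % spec)
```

A hierarchical model such as HSROC additionally accounts for the trade-off between sensitivity and specificity across studies with different thresholds, so its estimates will generally differ from those of this simplified approach.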