Central South University of Forestry and Technology, Changsha, Hunan, China; State Key Laboratory of Pathogenesis, Prevention and Treatment of High Incidence Diseases in Central Asia, Clinical Medical Research Institute, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, Xinjiang, China.
Central South University of Forestry and Technology, Changsha, Hunan, China.
Ophthalmol Retina. 2024 Jul;8(7):678-687. doi: 10.1016/j.oret.2024.01.013. Epub 2024 Jan 17.
TOPIC: To evaluate the performance of machine learning (ML) in diagnosing retinopathy of prematurity (ROP) and to assess whether it can serve as an effective automated diagnostic tool in clinical practice.
CLINICAL RELEVANCE: Early detection of ROP is crucial for preventing tractional retinal detachment and blindness in preterm infants.
METHODS: Web of Science, PubMed, Embase, IEEE Xplore, and the Cochrane Library were searched from inception to October 1, 2022, for published studies on image-based ML for the diagnosis of ROP or the classification of its clinical subtypes. The risk of bias (RoB) of the included original studies was assessed with the quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies. A bivariate mixed-effects model was used for the quantitative analysis, and Deeks' test was used to assess publication bias. Quality of evidence was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach.
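To make the publication-bias step concrete, the following is a minimal sketch of the regression behind Deeks' funnel plot asymmetry test as it is commonly described: the log diagnostic odds ratio of each study is regressed on the inverse square root of its effective sample size, weighted by effective sample size, and a slope far from zero suggests small-study effects. The function name, the 0.5 continuity correction applied to every cell, and the example counts are assumptions for illustration; they are not taken from the review, and the significance test on the slope is omitted.

```python
import math


def deeks_regression(studies):
    """Weighted regression of ln(DOR) on 1/sqrt(ESS), weighted by ESS.

    `studies` is a list of (tp, fp, fn, tn) counts, one tuple per study.
    Returns (slope, intercept). Illustrative sketch only.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        n_diseased = tp + fn
        n_healthy = fp + tn
        # Effective sample size penalises unbalanced diseased/healthy groups.
        ess = 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)
        # Assumption: add 0.5 to every cell so zero counts do not break the log.
        a, b, c, d = (count + 0.5 for count in (tp, fp, fn, tn))
        ln_dor = math.log((a * d) / (b * c))
        xs.append(1.0 / math.sqrt(ess))
        ys.append(ln_dor)
        ws.append(ess)

    total_w = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("all studies have the same effective sample size")
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical 2x2 counts, not data from the included studies.
    example = [(90, 5, 10, 95), (40, 10, 10, 140), (180, 12, 20, 288)]
    print(deeks_regression(example))
```

In practice the slope would be tested against zero with a t statistic from the same weighted regression; a statistics package is the safer choice for that step.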
RESULTS: Twenty-two studies were included in the systematic review; 4 studies had high or unclear RoB. In the index test domain, only 2 studies had high or unclear RoB, because they did not establish predefined thresholds. In the reference standard domain, 3 studies had high or unclear RoB. Regarding applicability, only 1 study raised high or unclear concerns, in the patient selection domain. The sensitivity and specificity of image-based ML for the diagnosis of ROP were 0.93 (95% confidence interval [CI]: 0.90-0.94) and 0.95 (95% CI: 0.94-0.97), respectively. The area under the receiver operating characteristic curve (AUC) was 0.98 (95% CI: 0.97-0.99). For the classification of clinical subtypes of ROP, the sensitivity and specificity were 0.93 (95% CI: 0.89-0.96) and 0.93 (95% CI: 0.89-0.95), respectively, and the AUC was 0.97 (95% CI: 0.96-0.98). The classification results were highly similar to those of clinical experts (Spearman's R = 0.879).
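The pooled estimates above come from the bivariate mixed-effects model, which is not reproduced here. As a simpler illustration of the quantities being pooled, the sketch below computes sensitivity and specificity for a single study from its 2x2 counts, each with a 95% interval. The Wilson score interval is an assumption chosen for illustration; the review does not state how per-study intervals were computed, and the function names and counts are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


def sensitivity_specificity(tp, fp, fn, tn):
    """Return point estimates and Wilson intervals for one study's 2x2 table."""
    sens = tp / (tp + fn)  # proportion of diseased eyes flagged by the model
    spec = tn / (tn + fp)  # proportion of healthy eyes correctly cleared
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Hypothetical counts, not data from the included studies.
    for name, (est, low, high) in sensitivity_specificity(93, 5, 7, 95).items():
        print(f"{name}: {est:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

Reporting the point estimate and its interval on the same scale, as this sketch does, avoids mixing percentages with proportions.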
CONCLUSION: Machine learning algorithms are no less accurate than human experts and hold considerable potential as automated diagnostic tools for ROP. However, given the quality and high heterogeneity of the available evidence, these algorithms should be considered supplementary tools to assist clinicians in diagnosing ROP.
FINANCIAL DISCLOSURE(S): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.