Lima Rian Vilar, Arruda Mateus Pimenta, Muniz Maria Carolina Rocha, Filho Helvécio Neves Feitosa, Ferrerira Daiane Memória Ribeiro, Pereira Samuel Montenegro
Department of Medicine, University of Fortaleza, Av. Washington Soares, 1321 - Edson Queiroz, Fortaleza - CE, Ceará, 60811-905, Brazil.
Penido Burnier Institute, São Paulo, Brazil.
Graefes Arch Clin Exp Ophthalmol. 2025 Feb;263(2):547-553. doi: 10.1007/s00417-024-06643-2. Epub 2024 Sep 18.
Artificial intelligence (AI) algorithms for the detection of retinoblastoma (RB) by fundus image analysis have been proposed as a potentially effective technique to facilitate diagnosis and screening programs. However, doubts remain about the accuracy of the technique, the best type of AI for this situation, and its feasibility for everyday use. Therefore, we performed a systematic review and meta-analysis to evaluate this issue.
Following PRISMA 2020 guidelines, a comprehensive search of the MEDLINE, Embase, ClinicalTrials.gov and IEEE Xplore databases identified 494 studies, whose titles and abstracts were screened for eligibility. We included diagnostic studies that evaluated the accuracy of AI in identifying retinoblastoma based on fundus imaging. Univariate and bivariate analyses were performed using a random-effects model. The study protocol was registered in PROSPERO under CRD42024499221.
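As an illustration of the univariate random-effects step, the following is a minimal sketch of DerSimonian-Laird pooling of per-study sensitivities on the logit scale. It is not the authors' code: the abstract does not name the software or estimator used, the function `pool_random_effects` is a hypothetical helper, and the study counts are invented for demonstration only. The bivariate model, which pools sensitivity and specificity jointly, is not shown.

```python
import math

def inv_logit(x):
    """Map a logit back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def pool_random_effects(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions (logit scale).

    events: per-study counts of correct calls (e.g. true positives)
    totals: per-study denominators (e.g. all diseased images)
    Returns (pooled, ci_low, ci_high, tau2).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # A 0.5 continuity correction is applied to every study here
        # (an assumption) so logits stay finite at 0% or 100%.
        a, b = e + 0.5, (n - e) + 0.5
        y.append(math.log(a / b))      # logit of the corrected proportion
        v.append(1.0 / a + 1.0 / b)    # approximate within-study variance

    k = len(y)
    w = [1.0 / vi for vi in v]         # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return (inv_logit(mu),
            inv_logit(mu - 1.96 * se),
            inv_logit(mu + 1.96 * se),
            tau2)

# Hypothetical counts, NOT taken from the included studies.
true_positives = [95, 180, 49]
diseased = [100, 190, 50]
pooled, low, high, tau2 = pool_random_effects(true_positives, diseased)
print(f"pooled sensitivity={pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau2={tau2:.3f}")
```

Pooling on the logit scale keeps the back-transformed confidence limits inside the 0 to 1 range, which matters when individual studies report sensitivities close to 100%.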
Six studies with 9902 fundus images were included, of which 5944 (60%) had confirmed RB. Only one dataset used a semi-supervised machine learning (ML) method; all other studies used supervised ML, three with architectures requiring high computational power and two with more economical models. The pooled analysis of all models showed a sensitivity of 98.2% (95% CI: 94.7%-99.4%), a specificity of 98.5% (95% CI: 91.6%-99.8%) and an AUC of 0.986 (95% CI: 0.970-0.989). Subgroup analyses comparing models with high and low computational power showed no significant difference (p=0.824).
AI methods showed high accuracy in the diagnosis of RB based on fundus images, with no significant difference between high and low computational power models, suggesting that their use is feasible. Validation and cost-effectiveness studies are needed in countries across different income levels. Subpopulations should also be analyzed, as AI may be useful as an initial screening tool in populations at high risk for RB, serving as a bridge to the pediatric ophthalmologist or ocular oncologist, who are scarce globally.
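To see why the choice of screening population matters, the pooled sensitivity and specificity can be converted into predictive values with Bayes' theorem. The sketch below is illustrative only: `predictive_values` is a hypothetical helper, and the prevalence values are assumptions chosen for demonstration, not figures from the review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence                  # true positives per person screened
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives
    fn = (1.0 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1.0 - prevalence)          # true negatives
    return tp / (tp + fp), tn / (tn + fn)

# Pooled estimates reported in the abstract.
SENSITIVITY = 0.982
SPECIFICITY = 0.985

# Hypothetical prevalences: general population through high-risk group.
for prevalence in (0.0001, 0.01, 0.10):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.4f}  PPV={ppv:.3f}  NPV={npv:.5f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises with prevalence, so a positive result is far more informative in a high-risk group than in unselected screening.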
What is known: Retinoblastoma is the most common intraocular cancer in childhood, and diagnostic delay is the main factor leading to a poor prognosis. Machine learning techniques offer reliable methods for the screening and diagnosis of retinal diseases.
What is new: The meta-analysis of the diagnostic accuracy of artificial intelligence methods for diagnosing retinoblastoma based on fundus images showed a sensitivity of 98.2% (95% CI: 94.7%-99.4%) and a specificity of 98.5% (95% CI: 91.6%-99.8%). There was no statistically significant difference in diagnostic accuracy between high and low computational power models. The overall performance of supervised machine learning was better than that of unsupervised learning, although few studies were available on the latter.