Camacho-García-Formentí Dalia, Baylón-Vázquez Gabriela, Arriozola-Rodríguez Karen, Avalos-Ramirez Enrique, Hartleben-Matkin Curt, Valdez-Flores Hugo, Hodelin-Fuentes Damaris, Noriega Alejandro
Tech & Intelligence Department, PROSPERiA, Mexico City, Mexico.
Glaucoma Department, Instituto de Oftalmología Fundación Conde de Valenciana Institución de Asistencia Privada (IAP), Mexico City, Mexico.
Front Ophthalmol (Lausanne). 2025 May 19;5:1581212. doi: 10.3389/fopht.2025.1581212. eCollection 2025.
Artificial intelligence (AI) shows promise in ophthalmology, but its potential in tertiary care settings in Latin America remains understudied. We present a Mexican AI-powered screening tool and evaluate it against first-year ophthalmology residents in a tertiary care setting in Mexico City.
We analyzed data from 435 adult patients undergoing their first ophthalmic evaluation, who were assessed by an AI-based platform and by first-year ophthalmology residents. To detect retinal disease, the platform employs an Inception V3-based multi-output classification model with a 512 × 512 input resolution, chosen to capture small lesions. To evaluate glaucoma suspects, the system uses U-Net models that segment the optic disc and cup and calculates the cup-to-disc ratio (CDR) from their vertical heights. The AI and resident evaluations were compared with expert annotations for retinal disease, CDR measurements, and glaucoma suspect classification. In addition, we evaluated a synergistic approach combining AI and resident assessments.
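The CDR step described above can be illustrated with a minimal sketch. It assumes the segmentation models output binary masks and that "vertical height" means the number of image rows spanned by each mask; the function names `vertical_extent` and `vertical_cdr` and the toy masks are hypothetical and are not taken from the platform itself.

```python
import numpy as np


def vertical_extent(mask: np.ndarray) -> int:
    """Return the number of rows spanned by the foreground of a 2-D binary mask."""
    # Indices of rows that contain at least one foreground pixel.
    rows = np.flatnonzero(mask.any(axis=1))
    if rows.size == 0:
        return 0
    return int(rows[-1] - rows[0] + 1)


def vertical_cdr(disc_mask: np.ndarray, cup_mask: np.ndarray) -> float:
    """Vertical cup-to-disc ratio: cup height divided by disc height."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("optic disc mask is empty")
    return vertical_extent(cup_mask) / disc_height


# Toy masks (assumed data, for illustration only).
disc = np.zeros((20, 20), dtype=bool)
disc[5:15, 5:15] = True  # disc spans 10 rows
cup = np.zeros((20, 20), dtype=bool)
cup[8:12, 8:12] = True   # cup spans 4 rows

cdr = vertical_cdr(disc, cup)
print(cdr)
```

In this sketch an empty cup mask yields a CDR of 0, while an empty disc mask raises an error because the ratio is undefined.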
For glaucoma suspect classification, AI outperformed residents in accuracy (88.6% vs. 82.9%, p = 0.016), sensitivity (63.0% vs. 50.0%, p = 0.116), and specificity (94.5% vs. 90.5%, p = 0.062). The synergistic approach achieved a higher sensitivity (80.4%) than ophthalmic residents alone or AI alone (p < 0.001). AI's CDR estimates showed lower mean absolute error (0.056 vs. 0.105, p < 0.001) and higher correlation with expert measurements (r = 0.728 vs. r = 0.538). In the retinal disease assessment, AI demonstrated higher sensitivity (90.1% vs. 63.0% for medium/high risk, p < 0.001) and specificity (95.8% vs. 90.4%, p < 0.001). Furthermore, differences between AI and residents were statistically significant across all metrics. The synergistic approach achieved the highest sensitivity for retinal disease (92.6% for medium/high risk, 100% for high risk).
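As an illustration of how sensitivity and specificity behave when two raters are combined, the sketch below uses an OR rule (a case is flagged if either the AI or the resident flags it). The abstract does not state how the synergistic approach combines the two assessments, so the OR rule is an assumption, and the labels below are made-up toy data rather than study data.

```python
def sensitivity_specificity(pred, truth):
    """Return (sensitivity, specificity) for paired boolean predictions and labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    return tp / (tp + fn), tn / (tn + fp)


# Toy expert labels and rater outputs (assumed, for illustration only).
truth    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
ai       = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
resident = [0, 1, 1, 0, 0, 0, 0, 0, 1, 0]

# Assumed OR rule: flag the case if either rater flags it.
combined = [int(a or r) for a, r in zip(ai, resident)]

ai_sens, ai_spec = sensitivity_specificity(ai, truth)
res_sens, res_spec = sensitivity_specificity(resident, truth)
comb_sens, comb_spec = sensitivity_specificity(combined, truth)
print(ai_sens, res_sens, comb_sens)
print(ai_spec, res_spec, comb_spec)
```

Under an OR rule, combined sensitivity can never fall below that of either rater alone, and combined specificity can never exceed that of either rater alone, so a gain in sensitivity is typically paid for with some loss of specificity.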
AI outperformed first-year residents in key ophthalmic assessments. The synergistic use of AI and resident assessments showed potential for optimizing diagnostic accuracy, highlighting the value of AI as a supportive tool in ophthalmic practice, especially for early-career clinicians.