Department of Ophthalmology, Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, Miami, Florida.
Wills Eye Hospital, Thomas Jefferson University, Philadelphia, Pennsylvania.
Retina. 2024 Oct 1;44(10):1732-1740. doi: 10.1097/IAE.0000000000004204.
This study evaluates a large language model, Generative Pre-trained Transformer 4 with vision, for diagnosing vitreoretinal diseases in real-world ophthalmology settings.
This retrospective cross-sectional study at Bascom Palmer Eye Clinic analyzed patient data from January 2010 to March 2023 to assess the performance of Generative Pre-trained Transformer 4 with vision on retinal image analysis and International Classification of Diseases 10th revision coding across 2 patient groups: simpler cases (Group A) and complex cases (Group B) requiring more in-depth analysis. Diagnostic accuracy was assessed through open-ended questions and multiple-choice questions independently verified by three retina specialists.
In 256 eyes from 143 patients, Generative Pre-trained Transformer 4 with vision demonstrated an accuracy of 13.7% for open-ended questions and 31.3% for multiple-choice questions, with International Classification of Diseases 10th revision code accuracies of 5.5% and 31.3%, respectively. The model accurately diagnosed posterior vitreous detachment, nonexudative age-related macular degeneration, and retinal detachment. International Classification of Diseases 10th revision coding was most accurate for nonexudative age-related macular degeneration, central retinal vein occlusion, and macular hole in open-ended questions, and for posterior vitreous detachment, nonexudative age-related macular degeneration, and retinal detachment in multiple-choice questions. No significant difference in diagnostic or coding accuracy was found between Groups A and B.
Generative Pre-trained Transformer 4 with vision has potential in clinical care and record keeping, particularly with standardized questions. Its effectiveness in open-ended scenarios is limited, which significantly constrains its use for providing complex medical advice.