Accuracy of Artificial Intelligence Versus Clinicians in Real-Life Case Scenarios of Retinopathy of Prematurity.

Author information

Belenje Akash, Pandya Dhanush, Jalali Subhadra, Rani Padmaja K

Affiliation

Srimati Kanuri Santhamma Center for Vitreo-Retinal Diseases, Anant Bajaj Retina Institute, Kallam Anji Reddy Campus, L V Prasad Eye Institute, Hyderabad, IND.

Publication information

Cureus. 2025 Feb 5;17(2):e78597. doi: 10.7759/cureus.78597. eCollection 2025 Feb.

DOI:10.7759/cureus.78597

PMID:40062070

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11889417/

Abstract

Objective: This study compared the accuracy of ChatGPT, an artificial intelligence (AI) chatbot, with that of clinicians in real-life case scenarios of retinopathy of prematurity (ROP).

Methods: This prospective study used a questionnaire of real-life case scenarios with multiple-response answers. Thirteen clinicians, comprising eight vitreoretinal fellowship trainees (less than two years of experience in ROP management) and five ROP experts (more than three years of experience in ROP management), were given 10 real-life ROP case scenarios. The majority responses of the trainees and of the ROP experts were compared with the ChatGPT-generated responses. The ChatGPT exercise was run with both versions 3.5 and 4.0 and repeated more than a month apart, on May 29, 2024, and July 18, 2024, to check the consistency of the majority AI response. For each case scenario, the majority clinician response was compared with the majority AI response for agreement.

Results: ChatGPT answered nine of the 10 cases correctly (90%), outperforming the fellowship trainees (77.5%; 62 correct responses out of 80). ROP experts had the highest accuracy at 96% (48 correct responses out of 50). Agreement between the majority clinician responses and the ChatGPT responses was substantial, with a Cohen's kappa of 0.80.

Conclusion: ChatGPT showed substantial agreement with the majority clinician responses and performed better than the vitreoretinal fellowship trainees. ChatGPT is a promising software tool that can be explored further for use in real-life ROP case scenarios. A more specific prompt that names the screening guidelines to be followed can help ChatGPT give more accurate answers in line with the requested guidelines.
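The reported percentages follow directly from the counts given in the Results, and Cohen's kappa is the observed agreement corrected for the agreement expected by chance. The following is a minimal Python sketch of both calculations, not the authors' analysis code. The function names are illustrative, and the four-item label lists are hypothetical toy data used only to show the kappa formula; the abstract does not list the per-case responses that would be needed to reproduce the reported kappa of 0.80.

```python
from collections import Counter


def accuracy_percent(correct: int, total: int) -> float:
    """Return the share of correct responses as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


def cohen_kappa(rater_a, rater_b) -> float:
    """Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e is the agreement expected by chance from
    each rater's label frequencies.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Counts taken from the abstract.
chatgpt_accuracy = accuracy_percent(9, 10)
trainee_accuracy = accuracy_percent(62, 80)
expert_accuracy = accuracy_percent(48, 50)
print(chatgpt_accuracy, trainee_accuracy, expert_accuracy)

# Hypothetical toy labels (NOT study data), only to exercise the formula.
toy_clinicians = ["A", "A", "B", "B"]
toy_chatgpt = ["A", "A", "B", "A"]
toy_kappa = cohen_kappa(toy_clinicians, toy_chatgpt)
print(toy_kappa)
```

On the commonly used Landis and Koch scale, kappa values from 0.61 to 0.80 are described as substantial agreement, which matches the wording used in the abstract.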

Similar articles

Prescription of Controlled Substances: Benefits and Risks.

Artificial Intelligence in Peripheral Artery Disease Education: A Battle Between ChatGPT and Google Gemini.

Cureus. 2025 Jun 1;17(6):e85174. doi: 10.7759/cureus.85174. eCollection 2025 Jun.

Comparing Artificial Intelligence and Senior Residents in Oral Lesion Diagnosis: A Comparative Study.

Cureus. 2024 Jan 3;16(1):e51584. doi: 10.7759/cureus.51584. eCollection 2024 Jan.

Does ChatGPT update itself? Accuracy of ChatGPT in tympanostomy tube guidance: A comparative analysis with current literature.

Eur Arch Otorhinolaryngol. 2025 Aug 23. doi: 10.1007/s00405-025-09630-3.

Early erythropoietin for preventing red blood cell transfusion in preterm and/or low birth weight infants.

Cochrane Database Syst Rev. 2006 Jul 19(3):CD004863. doi: 10.1002/14651858.CD004863.pub2.

Potential of ChatGPT in youth mental health emergency triage: Comparative analysis with clinicians.

PCN Rep. 2025 Jul 15;4(3):e70159. doi: 10.1002/pcn5.70159. eCollection 2025 Sep.

Navigating the future of pediatric cardiovascular surgery: Insights and innovation powered by Chat Generative Pre-Trained Transformer (ChatGPT).

J Thorac Cardiovasc Surg. 2025 Feb 1. doi: 10.1016/j.jtcvs.2025.01.022.

Falls prevention interventions for community-dwelling older adults: systematic review and meta-analysis of benefits, harms, and patient values and preferences.

Syst Rev. 2024 Nov 26;13(1):289. doi: 10.1186/s13643-024-02681-3.

[Preliminary exploration of the applications of five large language models in the field of oral auxiliary diagnosis, treatment and health consultation].

Zhonghua Kou Qiang Yi Xue Za Zhi. 2025 Jul 30;60(8):871-878. doi: 10.3760/cma.j.cn112144-20241107-00418.
