Evaluating ChatGPT-4 Plus in Ophthalmology: Effect of Image Recognition and Domain-Specific Pretraining on Diagnostic Performance.

Authors

Wu Kevin Y, Qian Shu Yu, Marchand Michael

Affiliations

Department of Surgery, Division of Ophthalmology, University of Sherbrooke, Sherbrooke, QC J1G 2E8, Canada.

Faculty of Medicine, University of Sherbrooke, Sherbrooke, QC J1G 2E8, Canada.

Publication information

Diagnostics (Basel). 2025 Jul 19;15(14):1820. doi: 10.3390/diagnostics15141820.

DOI:10.3390/diagnostics15141820
PMID:40722569
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12293433/
Abstract

Background/Objectives: In recent years, rapid advances in artificial intelligence models such as ChatGPT (version of 29 April 2024) have drawn interest from many domains of medicine, including ophthalmology. Research is therefore needed to assess the model's potential while also evaluating its shortcomings. Our study evaluates ChatGPT-4's performance on the American Academy of Ophthalmology's (AAO) Basic and Clinical Science Course (BCSC) Self-Assessment Program, focusing on its image recognition capabilities and its enhancement with domain-specific pretraining.

Methods: The chatbot was tested on 1300 BCSC Self-Assessment Program questions, including text-based and image-based questions. Domain-specific pretraining was tested for performance improvements. The primary outcome was the model's accuracy when presented with text-based and image-based multiple-choice questions. Logistic regression and post hoc analyses examined performance variations by question difficulty, image presence, and subspecialty.

Results: The chatbot achieved an average accuracy of 78%, compared with the average test-taker score of 74%. The repeatability kappa was 0.85 (95% CI: 0.82-0.87). Following domain-specific pretraining, the model's overall accuracy increased to 85%. The accuracy of the model's responses depended first on question difficulty (LR = 366), followed by image presence (LR = 108) and exam section (LR = 79).

Conclusions: The chatbot appeared to be similar or superior to human trainee test takers in ophthalmology, even on image recognition questions. Domain-specific training appeared to improve accuracy. While these results do not necessarily imply that the chatbot has the comprehensive skill level of a human ophthalmologist, they suggest that these tools may have educational value if additional investigations provide similar results.
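As an illustration of the repeatability statistic reported in the abstract, the following is a minimal sketch of a kappa computation. It rests on assumptions not stated in the abstract: that repeatability was measured as Cohen's kappa between two runs of the model on the same questions, and that each answer is a single choice label. The function name `cohens_kappa` and the answer lists are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(run_a, run_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)

    # Observed agreement: fraction of questions answered identically in both runs.
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    count_a = Counter(run_a)
    count_b = Counter(run_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both runs used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers to eight multiple-choice questions (illustrative only).
first_run = ["A", "B", "C", "D", "A", "B", "C", "D"]
second_run = ["A", "B", "C", "D", "A", "B", "C", "A"]

kappa = cohens_kappa(first_run, second_run)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is often preferred over a simple percentage match when the number of answer options is small.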


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/249f/12293433/8ad59056c284/diagnostics-15-01820-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/249f/12293433/93f18c0b309b/diagnostics-15-01820-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/249f/12293433/1eef1d4e8cb7/diagnostics-15-01820-g002.jpg
