

Improved Performance of ChatGPT-4 on the OKAP Examination: A Comparative Study with ChatGPT-3.5.

Authors

Teebagy Sean, Colwell Lauren, Wood Emma, Yaghy Antonio, Faustina Misha

Affiliation

Department of Ophthalmology and Visual Sciences, UMass Chan Medical School, Worcester, Massachusetts.

Publication

J Acad Ophthalmol (2017). 2023 Sep 11;15(2):e184-e187. doi: 10.1055/s-0043-1774399. eCollection 2023 Jul.

DOI: 10.1055/s-0043-1774399
PMID: 37701862
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10495224/
Abstract

This study aims to evaluate the performance of ChatGPT-4, an advanced artificial intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination compared to its predecessor, ChatGPT-3.5. Both models were tested on 180 OKAP practice questions covering various ophthalmology subject categories. ChatGPT-4 significantly outperformed ChatGPT-3.5 (81% vs. 57%; p < 0.001), indicating improvements in medical knowledge assessment. The superior performance of ChatGPT-4 suggests potential applicability in ophthalmologic education and clinical decision support systems. Future research should focus on refining AI models, ensuring a balanced representation of fundamental and specialized knowledge, and determining the optimal method of integrating AI into medical education and practice.
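The reported gap (81% vs. 57% on 180 questions each, p < 0.001) can be sanity-checked with a standard two-proportion z-test. This is a minimal sketch, assuming 180 questions per model and scores rounded to the stated percentages; the abstract does not state which statistical test the authors actually used.

```python
from math import sqrt, erfc

# Figures taken from the abstract (assumption: 180 questions per model,
# correct counts reconstructed from the rounded 81% and 57% scores).
n = 180
correct_gpt4 = round(0.81 * n)   # ~146 correct answers
correct_gpt35 = round(0.57 * n)  # ~103 correct answers

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided two-proportion z-test with a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the normal tail via the complementary error function
    return z, erfc(abs(z) / sqrt(2))

z, p = two_proportion_z(correct_gpt4, n, correct_gpt35, n)
print(f"z = {z:.2f}, p = {p:.2e}")
```

Under these assumptions the z statistic is roughly 4.9, giving a two-sided p-value well below 0.001, consistent with the abstract's claim of a significant difference.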


Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1684/10495224/82cb9859803f/10-1055-s-0043-1774399-i425-1.jpg

Similar Articles

1. Improved Performance of ChatGPT-4 on the OKAP Examination: A Comparative Study with ChatGPT-3.5.
   J Acad Ophthalmol (2017). 2023 Sep 11;15(2):e184-e187. doi: 10.1055/s-0043-1774399. eCollection 2023 Jul.
2. Comparison of Gemini Advanced and ChatGPT 4.0's Performances on the Ophthalmology Resident Ophthalmic Knowledge Assessment Program (OKAP) Examination Review Question Banks.
   Cureus. 2024 Sep 17;16(9):e69612. doi: 10.7759/cureus.69612. eCollection 2024 Sep.
3. Success of ChatGPT, an AI language model, in taking the French language version of the European Board of Ophthalmology examination: A novel approach to medical knowledge assessment.
   J Fr Ophtalmol. 2023 Sep;46(7):706-711. doi: 10.1016/j.jfo.2023.05.006. Epub 2023 Aug 1.
4. Gemini AI vs. ChatGPT: A comprehensive examination alongside ophthalmology residents in medical knowledge.
   Graefes Arch Clin Exp Ophthalmol. 2025 Feb;263(2):527-536. doi: 10.1007/s00417-024-06625-4. Epub 2024 Sep 15.
5. Evaluating the Performance of ChatGPT in Ophthalmology: An Analysis of Its Successes and Shortcomings.
   Ophthalmol Sci. 2023 May 5;3(4):100324. doi: 10.1016/j.xops.2023.100324. eCollection 2023 Dec.
6. Development and Evaluation of Aeyeconsult: A Novel Ophthalmology Chatbot Leveraging Verified Textbook Knowledge and GPT-4.
   J Surg Educ. 2024 Mar;81(3):438-443. doi: 10.1016/j.jsurg.2023.11.019. Epub 2023 Dec 21.
7. Performance of ChatGPT on Ophthalmology-Related Questions Across Various Examination Levels: Observational Study.
   JMIR Med Educ. 2024 Jan 18;10:e50842. doi: 10.2196/50842.
8. A multicenter analysis of the ophthalmic knowledge assessment program and American Board of Ophthalmology written qualifying examination performance.
   Ophthalmology. 2012 Oct;119(10):1949-53. doi: 10.1016/j.ophtha.2012.06.010. Epub 2012 Jul 28.
9. Assessing the Capability of ChatGPT in Answering First- and Second-Order Knowledge Questions on Microbiology as per Competency-Based Medical Education Curriculum.
   Cureus. 2023 Mar 12;15(3):e36034. doi: 10.7759/cureus.36034. eCollection 2023 Mar.
10. ChatGPT Conquers the Saudi Medical Licensing Exam: Exploring the Accuracy of Artificial Intelligence in Medical Knowledge Assessment and Implications for Modern Medical Education.
   Cureus. 2023 Sep 11;15(9):e45043. doi: 10.7759/cureus.45043. eCollection 2023 Sep.

Cited By

1. ChatGPT-4o and OpenAI-o1: A Comparative Analysis of Its Accuracy in Refractive Surgery.
   J Clin Med. 2025 Jul 22;14(15):5175. doi: 10.3390/jcm14155175.
2. Evaluating ChatGPT-4 Plus in Ophthalmology: Effect of Image Recognition and Domain-Specific Pretraining on Diagnostic Performance.
   Diagnostics (Basel). 2025 Jul 19;15(14):1820. doi: 10.3390/diagnostics15141820.
3. EYE-Llama, an in-domain large language model for ophthalmology.
   iScience. 2025 Jun 23;28(7):112984. doi: 10.1016/j.isci.2025.112984. eCollection 2025 Jul 18.
4. Utilizing ChatGPT-3.5 to Assist Ophthalmologists in Clinical Decision-making.
   J Ophthalmic Vis Res. 2025 May 5;20. doi: 10.18502/jovr.v20.14692. eCollection 2025.
5. Evaluation and comparison of large language models' responses to questions related optic neuritis.
   Front Med (Lausanne). 2025 Jun 25;12:1516442. doi: 10.3389/fmed.2025.1516442. eCollection 2025.
6. Evaluating the accuracy of advanced language learning models in ophthalmology: A comparative study of ChatGPT-4o and Meta AI's Llama 3.1.
   Adv Ophthalmol Pract Res. 2025 Jan 6;5(2):95-99. doi: 10.1016/j.aopr.2025.01.002. eCollection 2025 May-Jun.
7. Accuracy of Large Language Models When Answering Clinical Research Questions: Systematic Review and Network Meta-Analysis.
   J Med Internet Res. 2025 Apr 30;27:e64486. doi: 10.2196/64486.
8. ChatGPT and Other Large Language Models in Medical Education - Scoping Literature Review.
   Med Sci Educ. 2024 Nov 13;35(1):555-567. doi: 10.1007/s40670-024-02206-6. eCollection 2025 Feb.
9. ChatGPT-4 Omni's superiority in answering multiple-choice oral radiology questions.
   BMC Oral Health. 2025 Feb 1;25(1):173. doi: 10.1186/s12903-025-05554-w.
10. Evaluating the Performance of ChatGPT 3.5 and 4.0 on StatPearls Oculoplastic Surgery Text- and Image-Based Exam Questions.
   Cureus. 2024 Nov 16;16(11):e73812. doi: 10.7759/cureus.73812. eCollection 2024 Nov.

References

1. Evaluating the Performance of ChatGPT in Ophthalmology: An Analysis of Its Successes and Shortcomings.
   Ophthalmol Sci. 2023 May 5;3(4):100324. doi: 10.1016/j.xops.2023.100324. eCollection 2023 Dec.
2. Can artificial intelligence pass the Fellowship of the Royal College of Radiologists examination? Multi-reader diagnostic accuracy study.
   BMJ. 2022 Dec 21;379:e072826. doi: 10.1136/bmj-2022-072826.
3. The Pursuit of Generalizability and Equity Through Artificial Intelligence-Based Risk Prediction Models.
   JAMA Ophthalmol. 2022 Aug 1;140(8):798-799. doi: 10.1001/jamaophthalmol.2022.2139.
4. Do no harm: a roadmap for responsible machine learning for health care.
   Nat Med. 2019 Sep;25(9):1337-1340. doi: 10.1038/s41591-019-0548-6. Epub 2019 Aug 19.