Performance of ChatGPT in emergency medicine residency exams in Qatar: A comparative analysis with resident physicians.

Author information

Iftikhar Haris, Anjum Shahzad, Bhutta Zain A, Najam Mavia, Bashir Khalid

Affiliations

Emergency Medicine, Hamad General Hospital, Doha, Qatar.

Department of Medical Education, Hamad Medical Corporation, Doha, Qatar.

Publication information

Qatar Med J. 2024 Nov 11;2024(4):61. doi: 10.5339/qmj.2024.61. eCollection 2024.

DOI:10.5339/qmj.2024.61

PMID:39552949

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11568194/

Abstract

INTRODUCTION

The integration of artificial intelligence (AI) into healthcare has transformed medical practice by introducing new techniques for medical education, diagnosis, and treatment. In medical education, the potential of AI to enhance learning and assessment methods is increasingly recognized. This study aims to evaluate the performance of OpenAI's Chat Generative Pre-Trained Transformer (ChatGPT) in emergency medicine (EM) residency examinations in Qatar and to compare it with the performance of resident physicians.

METHODS

A retrospective descriptive study with a mixed-methods design was conducted in August 2023. EM residents' examination scores were collected and compared with the performance of ChatGPT on the same examinations. The examinations consisted of multiple-choice questions (MCQs) set by the same faculty responsible for the Qatari Board EM examinations. ChatGPT's performance on these examinations was analyzed and compared with that of residents across postgraduate years (PGY).

RESULTS

The study included 238 emergency department residents from PGY1 to PGY4 and compared their performance with that of ChatGPT. ChatGPT scored consistently higher than every resident group in all examination categories. Among senior residents, a notable decline in passing rates was observed, which may indicate a misalignment between examination performance and practical competencies. Another likely explanation is the impact of the COVID-19 pandemic on their learning experience, knowledge acquisition, and consolidation.

CONCLUSION

ChatGPT demonstrated significant proficiency in the theoretical knowledge of EM, outperforming resident physicians in examination settings. This finding suggests the potential of AI as a supplementary tool in medical education.
