


Appraisal of ChatGPT's Aptitude for Medical Education: Comparative Analysis With Third-Year Medical Students in a Pulmonology Examination.

Affiliations

Faculté de Médecine de Tunis, Université de Tunis El Manar, Tunis, Tunisia.

Publication

JMIR Med Educ. 2024 Jul 23;10:e52818. doi: 10.2196/52818.

DOI: 10.2196/52818
PMID: 39042876
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11303904/
Abstract

BACKGROUND

The rapid evolution of ChatGPT has generated substantial interest and led to extensive discussions in both public and academic domains, particularly in the context of medical education.

OBJECTIVE

This study aimed to evaluate ChatGPT's performance in a pulmonology examination through a comparative analysis with that of third-year medical students.

METHODS

In this cross-sectional study, we conducted a comparative analysis with 2 distinct groups. The first group comprised 244 third-year medical students who had previously taken our institution's 2020 pulmonology examination, which was conducted in French. The second group involved ChatGPT-3.5 in 2 separate sets of conversations: without contextualization (V1) and with contextualization (V2). In both V1 and V2, ChatGPT received the same set of questions administered to the students.

RESULTS

V1 demonstrated exceptional proficiency in radiology, microbiology, and thoracic surgery, surpassing the majority of medical students in these domains. However, it faced challenges in pathology, pharmacology, and clinical pneumology. In contrast, V2 consistently delivered more accurate responses across various question categories, regardless of the specialization. ChatGPT exhibited suboptimal performance in multiple choice questions compared to medical students. V2 excelled in responding to structured open-ended questions. Both ChatGPT conversations, particularly V2, outperformed students in addressing questions of low and intermediate difficulty. Interestingly, students showcased enhanced proficiency when confronted with highly challenging questions. V1 fell short of passing the examination. Conversely, V2 successfully achieved examination success, outperforming 139 (62.1%) medical students.

CONCLUSIONS

While ChatGPT has access to a comprehensive web-based data set, its performance closely mirrors that of an average medical student. Outcomes are influenced by question format, item complexity, and contextual nuances. The model faces challenges in medical contexts requiring information synthesis, advanced analytical aptitude, and clinical judgment, as well as in non-English language assessments and when confronted with data outside mainstream internet sources.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/b8f6b01d4e77/mededu_v10i1e52818_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/c6efc962b089/mededu_v10i1e52818_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/e734d06883c2/mededu_v10i1e52818_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/b8f6b01d4e77/mededu_v10i1e52818_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/c6efc962b089/mededu_v10i1e52818_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/e734d06883c2/mededu_v10i1e52818_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/b8f6b01d4e77/mededu_v10i1e52818_fig3.jpg

Similar Articles

1. Appraisal of ChatGPT's Aptitude for Medical Education: Comparative Analysis With Third-Year Medical Students in a Pulmonology Examination.
JMIR Med Educ. 2024 Jul 23;10:e52818. doi: 10.2196/52818.
2. Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis.
J Med Internet Res. 2024 Jul 25;26:e60807. doi: 10.2196/60807.
3. How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment.
JMIR Med Educ. 2023 Feb 8;9:e45312. doi: 10.2196/45312.
4. Performance of ChatGPT on the Chinese Postgraduate Examination for Clinical Medicine: Survey Study.
JMIR Med Educ. 2024 Feb 9;10:e48514. doi: 10.2196/48514.
5. Exploring the Performance of ChatGPT Versions 3.5, 4, and 4 With Vision in the Chilean Medical Licensing Examination: Observational Study.
JMIR Med Educ. 2024 Apr 29;10:e55048. doi: 10.2196/55048.
6. Comparison of the Performance of GPT-3.5 and GPT-4 With That of Medical Students on the Written German Medical Licensing Examination: Observational Study.
JMIR Med Educ. 2024 Feb 8;10:e50965. doi: 10.2196/50965.
7. Integrating ChatGPT in Orthopedic Education for Medical Undergraduates: Randomized Controlled Trial.
J Med Internet Res. 2024 Aug 20;26:e57037. doi: 10.2196/57037.
8. Performance of ChatGPT on Nursing Licensure Examinations in the United States and China: Cross-Sectional Study.
JMIR Med Educ. 2024 Oct 3;10:e52746. doi: 10.2196/52746.
9. ChatGPT's performance in German OB/GYN exams - paving the way for AI-enhanced medical education and clinical practice.
Front Med (Lausanne). 2023 Dec 13;10:1296615. doi: 10.3389/fmed.2023.1296615. eCollection 2023.
10. ChatGPT's performance in dentistry and allergy-immunology assessments: a comparative study.
Swiss Dent J. 2023 Oct 4;134(2):1-17. doi: 10.61872/sdj-2024-06-01.

Cited By

1. Evaluating the Use of ChatGPT 3.5 and Bard as Self-Assessment Tools for Short Answer Questions in Undergraduate Ophthalmology.
Cureus. 2025 Jun 18;17(6):e86288. doi: 10.7759/cureus.86288. eCollection 2025 Jun.
2. Comparison of ChatGPT and Internet Research for Clinical Research and Decision-Making in Occupational Medicine: Randomized Controlled Trial.
JMIR Form Res. 2025 May 20;9:e63857. doi: 10.2196/63857.
3. ChatGPT's Performance on Portuguese Medical Examination Questions: Comparative Analysis of ChatGPT-3.5 Turbo and ChatGPT-4o Mini.
JMIR Med Educ. 2025 Mar 5;11:e65108. doi: 10.2196/65108.

References

1. Diagnostic and Management Applications of ChatGPT in Structured Otolaryngology Clinical Scenarios.
OTO Open. 2023 Aug 22;7(3):e67. doi: 10.1002/oto2.67. eCollection 2023 Jul-Sep.
2. ChatGPT Performs on the Chinese National Medical Licensing Examination.
J Med Syst. 2023 Aug 15;47(1):86. doi: 10.1007/s10916-023-01961-0.
3. Artificial intelligence in orthopaedics: can Chat Generative Pre-trained Transformer (ChatGPT) pass Section 1 of the Fellowship of the Royal College of Surgeons (Trauma & Orthopaedics) examination?
Postgrad Med J. 2023 Sep 21;99(1176):1110-1114. doi: 10.1093/postmj/qgad053.
4. Performance of GPT-3.5 and GPT-4 on the Japanese Medical Licensing Examination: Comparison Study.
JMIR Med Educ. 2023 Jun 29;9:e48002. doi: 10.2196/48002.
5. ChatGPT can pass the AHA exams: Open-ended questions outperform multiple-choice format.
Resuscitation. 2023 Jul;188:109783. doi: 10.1016/j.resuscitation.2023.109783.
6. Evaluating the limits of AI in medical specialisation: ChatGPT's performance on the UK Neurology Specialty Certificate Examination.
BMJ Neurol Open. 2023 Jun 15;5(1):e000451. doi: 10.1136/bmjno-2023-000451. eCollection 2023.
7. ChatGPT failed Taiwan's Family Medicine Board Exam.
J Chin Med Assoc. 2023 Aug 1;86(8):762-766. doi: 10.1097/JCMA.0000000000000946. Epub 2023 Jun 9.
8. ChatGPT takes on the European Exam in Core Cardiology: an artificial intelligence success story?
Eur Heart J Digit Health. 2023 Apr 24;4(3):279-281. doi: 10.1093/ehjdh/ztad029. eCollection 2023 May.
9. Analysis of large-language model versus human performance for genetics questions.
Eur J Hum Genet. 2024 Apr;32(4):466-468. doi: 10.1038/s41431-023-01396-8. Epub 2023 May 29.
10. Performance of ChatGPT on the pharmacist licensing examination in Taiwan.
J Chin Med Assoc. 2023 Jul 1;86(7):653-658. doi: 10.1097/JCMA.0000000000000942. Epub 2023 Jul 5.