Appraisal of ChatGPT's Aptitude for Medical Education: Comparative Analysis With Third-Year Medical Students in a Pulmonology Examination.

Affiliation

Faculté de Médecine de Tunis, Université de Tunis El Manar, Tunis, Tunisia.

Publication information

JMIR Med Educ. 2024 Jul 23;10:e52818. doi: 10.2196/52818.


DOI:10.2196/52818
PMID:39042876
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11303904/
Abstract

BACKGROUND: The rapid evolution of ChatGPT has generated substantial interest and led to extensive discussions in both public and academic domains, particularly in the context of medical education.

OBJECTIVE: This study aimed to evaluate ChatGPT's performance in a pulmonology examination through a comparative analysis with that of third-year medical students.

METHODS: In this cross-sectional study, we conducted a comparative analysis with 2 distinct groups. The first group comprised 244 third-year medical students who had previously taken our institution's 2020 pulmonology examination, which was conducted in French. The second group involved ChatGPT-3.5 in 2 separate sets of conversations: without contextualization (V1) and with contextualization (V2). In both V1 and V2, ChatGPT received the same set of questions administered to the students.

RESULTS: V1 demonstrated exceptional proficiency in radiology, microbiology, and thoracic surgery, surpassing the majority of medical students in these domains. However, it faced challenges in pathology, pharmacology, and clinical pneumology. In contrast, V2 consistently delivered more accurate responses across various question categories, regardless of the specialization. ChatGPT exhibited suboptimal performance in multiple choice questions compared to medical students. V2 excelled in responding to structured open-ended questions. Both ChatGPT conversations, particularly V2, outperformed students in addressing questions of low and intermediate difficulty. Interestingly, students showcased enhanced proficiency when confronted with highly challenging questions. V1 fell short of passing the examination. Conversely, V2 successfully achieved examination success, outperforming 139 (62.1%) medical students.

CONCLUSIONS: While ChatGPT has access to a comprehensive web-based data set, its performance closely mirrors that of an average medical student. Outcomes are influenced by question format, item complexity, and contextual nuances. The model faces challenges in medical contexts requiring information synthesis, advanced analytical aptitude, and clinical judgment, as well as in non-English language assessments and when confronted with data outside mainstream internet sources.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/b8f6b01d4e77/mededu_v10i1e52818_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/c6efc962b089/mededu_v10i1e52818_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6f3a/11303904/e734d06883c2/mededu_v10i1e52818_fig2.jpg
