Suppr超能文献

比较 ChatGPT 和 GPT-4 在 USMLE 软技能评估中的表现。

Comparing ChatGPT and GPT-4 performance in USMLE soft skill assessments.

机构信息

Department of Diagnostic Imaging, Chaim Sheba Medical Center, Ramat Gan, Israel.

Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel.

出版信息

Sci Rep. 2023 Oct 1;13(1):16492. doi: 10.1038/s41598-023-43436-9.

Abstract

The United States Medical Licensing Examination (USMLE) has been a subject of performance study for artificial intelligence (AI) models. However, their performance on questions involving USMLE soft skills remains unexplored. This study aimed to evaluate ChatGPT and GPT-4 on USMLE questions involving communication skills, ethics, empathy, and professionalism. We used 80 USMLE-style questions involving soft skills, taken from the USMLE website and the AMBOSS question bank. A follow-up query was used to assess the models' consistency. The performance of the AI models was compared to that of previous AMBOSS users. GPT-4 outperformed ChatGPT, correctly answering 90% compared to ChatGPT's 62.5%. GPT-4 showed more confidence, not revising any responses, while ChatGPT modified its original answers 82.5% of the time. The performance of GPT-4 was higher than that of AMBOSS's past users. Both AI models, notably GPT-4, showed capacity for empathy, indicating AI's potential to meet the complex interpersonal, ethical, and professional demands intrinsic to the practice of medicine.

摘要

美国医师执照考试(USMLE)一直是人工智能(AI)模型的性能研究课题。然而,它们在涉及 USMLE 软技能的问题上的表现仍未得到探索。本研究旨在评估 ChatGPT 和 GPT-4 在涉及沟通技巧、伦理、同理心和专业精神的 USMLE 问题上的表现。我们使用了 80 个来自 USMLE 网站和 AMBOSS 题库的涉及软技能的 USMLE 风格问题。后续查询用于评估模型的一致性。AI 模型的性能与之前的 AMBOSS 用户进行了比较。GPT-4 的表现优于 ChatGPT,正确回答了 90%的问题,而 ChatGPT 的正确回答率为 62.5%。GPT-4 表现出更高的信心,没有修改任何回答,而 ChatGPT 有 82.5%的时间修改了其原始回答。GPT-4 的表现高于 AMBOSS 过去用户的表现。这两个 AI 模型,特别是 GPT-4,都表现出了同理心的能力,这表明 AI 有潜力满足医学实践中固有的复杂的人际、伦理和专业要求。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af26/10543445/903bd0cd1bf7/41598_2023_43436_Fig1_HTML.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验