Performance of ChatGPT on the Situational Judgement Test-A Professional Dilemmas-Based Examination for Doctors in the United Kingdom.

Author Information

Borchert Robin J, Hickman Charlotte R, Pepys Jack, Sadler Timothy J

Affiliations

Department of Radiology, University of Cambridge, Cambridge, United Kingdom.

Department of Radiology, Addenbrooke's Hospital, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom.

Publication Information

JMIR Med Educ. 2023 Aug 7;9:e48978. doi: 10.2196/48978.

Abstract

BACKGROUND

ChatGPT is a large language model that has performed well on professional examinations in the fields of medicine, law, and business. However, it is unclear how ChatGPT would perform on an examination assessing professionalism and situational judgement for doctors.

OBJECTIVE

We evaluated the performance of ChatGPT on the Situational Judgement Test (SJT): a national examination taken by all final-year medical students in the United Kingdom. This examination is designed to assess attributes such as communication, teamwork, patient safety, prioritization skills, professionalism, and ethics.

METHODS

All questions from the UK Foundation Programme Office's (UKFPO's) 2023 SJT practice examination were entered into ChatGPT. For each question, ChatGPT's answers and rationales were recorded and assessed against the official UKFPO scoring template. Questions were categorized into domains of Good Medical Practice on the basis of the domains referenced in the rationales provided in the scoring sheet. Questions without clear domain links were screened by reviewers and assigned one or multiple domains. ChatGPT's overall performance, as well as its performance across the domains of Good Medical Practice, was evaluated.

RESULTS

Overall, ChatGPT performed well, scoring 76% on the SJT, but it achieved full marks on only a few questions (9%). This may reflect flaws in ChatGPT's situational judgement, inconsistencies in the examination's own reasoning across questions, or both. ChatGPT demonstrated consistent performance across the 4 domains outlined in Good Medical Practice for doctors.

CONCLUSIONS

Further research is needed to understand the potential applications of large language models, such as ChatGPT, in medical education for standardizing questions and providing consistent rationales for examinations assessing professionalism and ethics.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a788/10442724/081a7c550493/mededu_v9i1e48978_fig1.jpg
