Ozdag Yagiz, Hayes Daniel S, Makar Gabriel S, Manzar Shahid, Foster Brian K, Shultz Mason J, Klena Joel C, Grandizio Louis C
Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
J Hand Surg Glob Online. 2023 Dec 11;6(2):164-168. doi: 10.1016/j.jhsg.2023.10.013. eCollection 2024 Mar.
PURPOSE: There is a paucity of investigations examining applications for artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE) and to compare its performance with the examination performance of hand surgery residents. METHODS: We selected questions from the 2020-2022 OITEs within the hand and UE and the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those with supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered into ChatGPT three times: (1) as an open-ended prompt requesting a free-text response; (2) as a multiple-choice prompt without a request for justification; and (3) as a multiple-choice prompt with a request for justification. We used each year's OITE scoring guide to compare the percentage of correct ChatGPT responses with the percentage of correct resident responses. RESULTS: A total of 102 UE OITE questions were included; 59 were text-only and 43 were media questions. ChatGPT correctly answered 46 of the 102 questions (45%) with the multiple-choice-without-justification prompt (42% for text-only and 44% for media questions). On the same questions, postgraduate year 1 orthopaedic residents averaged 51% correct and postgraduate year 5 residents averaged 76% correct. CONCLUSIONS: ChatGPT answered fewer UE OITE questions correctly than hand surgery residents at all training levels. CLINICAL RELEVANCE: Further development of novel AI tools may be necessary if this technology is to have a role in UE education.
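For readers who want to reproduce a protocol like the one described in METHODS, the sketch below shows how the three prompt styles and the percent-correct comparison could be implemented. It is an illustration only: it assumes programmatic access through the OpenAI Python client with the gpt-3.5-turbo model, whereas the study entered questions into ChatGPT 3.5 itself; the function names and the answer key are hypothetical.

```python
# Minimal sketch of a three-prompt querying protocol (assumption: API access
# via the OpenAI Python client; the study used ChatGPT 3.5 interactively).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(question_text: str, choices: list[str] | None, justify: bool) -> str:
    """Send one question using one of the three prompt styles."""
    if choices is None:
        # (1) open-ended: question text only, requesting a free-text response
        prompt = question_text
    else:
        lettered = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
        suffix = (
            "Answer with the single best choice and justify your answer."
            if justify  # (3) multiple choice with justification
            else "Answer with the single best choice only."  # (2) without
        )
        prompt = f"{question_text}\n{lettered}\n{suffix}"
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def percent_correct(model_answers: list[str], key: list[str]) -> float:
    """Score responses against an answer key (hypothetical stand-in for
    the per-year OITE scoring guide) as a percentage correct."""
    hits = sum(a.strip().upper().startswith(k) for a, k in zip(model_answers, key))
    return 100 * hits / len(key)
```

For media questions, the text descriptions the two authors produced would be appended to question_text before submission, mirroring the media-to-text conversion step described in METHODS.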