Ozdag Yagiz, Hayes Daniel S, Makar Gabriel S, Manzar Shahid, Foster Brian K, Shultz Mason J, Klena Joel C, Grandizio Louis C
Department of Orthopaedic Surgery, Geisinger Musculoskeletal Institute, Geisinger Commonwealth School of Medicine, Danville, PA.
J Hand Surg Glob Online. 2023 Dec 11;6(2):164-168. doi: 10.1016/j.jhsg.2023.10.013. eCollection 2024 Mar.
PURPOSE: There is a paucity of investigations examining applications for artificial intelligence (AI) in upper-extremity (UE) surgical education. The purpose of this investigation was to assess the performance of a novel AI tool (ChatGPT) on UE questions from the Orthopaedic In-Training Examination (OITE) and to compare its performance with the examination performance of hand surgery residents. METHODS: We selected questions from the 2020-2022 OITEs within the hand and UE and the shoulder and elbow content domains. These questions were divided into two categories: those with text-only prompts (text-only questions) and those with supplementary images or videos (media questions). Two authors (B.K.F. and G.S.M.) converted the accompanying media into text-based descriptions. Included questions were entered into ChatGPT (version 3.5) to generate responses. Each OITE question was entered into ChatGPT three times: (1) as an open-ended prompt requesting a free-text response; (2) as a multiple-choice prompt without a request for justification; and (3) as a multiple-choice prompt with a request for justification. We used each year's OITE scoring guide to compare the percentage of correct ChatGPT responses with the percentage of correct resident responses. RESULTS: A total of 102 UE OITE questions were included; 59 were text-only and 43 were media questions. ChatGPT correctly answered 46 of the 102 questions (45%) with the multiple-choice-without-justification prompt (42% for text-only and 44% for media questions). On the same questions, postgraduate year 1 orthopaedic residents averaged 51% correct and postgraduate year 5 residents averaged 76% correct. CONCLUSIONS: ChatGPT answered fewer UE OITE questions correctly than hand surgery residents at all training levels. CLINICAL RELEVANCE: Further development of novel AI tools may be necessary if this technology is to have a role in UE education.
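For readers who want to reproduce a protocol like the one described in METHODS, the sketch below shows how the three prompt styles and the percent-correct comparison could be implemented. It is an illustration only: it assumes programmatic access through the OpenAI Python client with the gpt-3.5-turbo model, whereas the study entered questions into ChatGPT 3.5 itself; the function names and the answer key are hypothetical.

```python
# Minimal sketch of a three-prompt querying protocol (assumption: API access
# via the OpenAI Python client; the study used ChatGPT 3.5 interactively).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(question_text: str, choices: list[str] | None, justify: bool) -> str:
    """Send one question using one of the three prompt styles."""
    if choices is None:
        # (1) open-ended: question text only, requesting a free-text response
        prompt = question_text
    else:
        lettered = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
        suffix = (
            "Answer with the single best choice and justify your answer."
            if justify  # (3) multiple choice with justification
            else "Answer with the single best choice only."  # (2) without
        )
        prompt = f"{question_text}\n{lettered}\n{suffix}"
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

def percent_correct(model_answers: list[str], key: list[str]) -> float:
    """Score responses against an answer key (hypothetical stand-in for
    the per-year OITE scoring guide) as a percentage correct."""
    hits = sum(a.strip().upper().startswith(k) for a, k in zip(model_answers, key))
    return 100 * hits / len(key)
```

For media questions, the text descriptions the two authors produced would be appended to question_text before submission, mirroring the media-to-text conversion step described in METHODS.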