Zheng Aaron, Barker Cole J, Ferrante Sergio S, Squires Judy H, Branstetter IV Barton F, Hughes Marion A
Department of Radiology, University of Pittsburgh Medical Center, Pittsburgh, PA 15213.
Acad Radiol. 2025 Sep;32(9):5635-5642. doi: 10.1016/j.acra.2025.06.019. Epub 2025 Jul 7.
Chat Generative Pre-trained Transformer (ChatGPT) is a generative artificial intelligence chatbot based on a large language model (LLM), at the forefront of technological development, with promising applications in medical education. This study aims to evaluate the use of ChatGPT in generating board-style practice questions for radiology resident education.
Multiple-choice questions (MCQs) were generated by ChatGPT from resident lecture transcripts using a custom prompt. Seventeen of the ChatGPT-generated MCQs were selected for inclusion in the study and randomly combined with 11 MCQs written by attending radiologists. For each MCQ, the 21 participating radiology residents answered the question, rated it from 1 to 10 on its effectiveness in reinforcing lecture material, and indicated whether they believed the MCQ was written by an attending radiologist at their institution or came from an alternative source.
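The blinding step above depends on pooling the two question sets in random order before presenting them to residents. A minimal sketch of that design, with illustrative variable names not taken from the paper:

```python
import random

# 17 ChatGPT-generated MCQs and 11 attending-written MCQs, tagged by
# source so responses can later be scored per group. The tuples stand in
# for real question objects.
chatgpt_mcqs = [("chatgpt", i) for i in range(17)]
attending_mcqs = [("attending", i) for i in range(11)]

# Residents see the pooled 28-item set in a single randomized order,
# with no indication of which source produced each question.
quiz = chatgpt_mcqs + attending_mcqs
random.shuffle(quiz)

sources = [src for src, _ in quiz]
print(len(quiz), sources.count("chatgpt"), sources.count("attending"))
```

Shuffling the combined pool (rather than alternating sources) keeps residents from inferring authorship from question position.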
Perceived MCQ quality was not significantly different between ChatGPT-generated (M=6.93, SD=0.29) and attending radiologist-written MCQs (M=7.08, SD=0.51) (p=0.15). MCQ correct answer percentages did not significantly differ between ChatGPT-generated (M=57%, SD=20%) and attending radiologist-written MCQs (M=59%, SD=25%) (p=0.78). The percentage of MCQs thought to be written by an attending radiologist was significantly different between ChatGPT-generated (M=57%, SD=13%) and attending radiologist-written MCQs (M=71%, SD=20%) (p=0.04).
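The group comparisons above are reported only as means, SDs, and p-values; the abstract does not state which test the authors used. As one plausible reconstruction, a Welch two-sample t-test computed from the published summary statistics looks like this (group sizes of 17 and 11 MCQs are assumed from the study design):

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t-statistic and degrees of freedom for two independent
    groups summarized by mean, SD, and sample size. This is a common
    choice for unequal group sizes; the paper's actual test is unknown."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Quality ratings from the abstract: ChatGPT MCQs (M=6.93, SD=0.29, n=17)
# vs. attending-written MCQs (M=7.08, SD=0.51, n=11).
t, df = welch_t(6.93, 0.29, 17, 7.08, 0.51, 11)
print(round(t, 3), round(df, 1))
```

The small negative t-statistic is consistent with the reported non-significant difference (p=0.15) in perceived quality.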
LLMs such as ChatGPT demonstrate potential for generating and presenting educational material in radiology resident education, and their use should be explored further on a larger scale.