ChatGPT to generate clinical vignettes for teaching and multiple-choice questions for assessment: A randomized controlled experiment.

Authors

Coşkun Özlem, Kıyak Yavuz Selim, Budakoğlu Işıl İrem

Affiliation

Department of Medical Education and Informatics, Gazi University, Ankara, Turkey.

Publication information

Med Teach. 2025 Feb;47(2):268-274. doi: 10.1080/0142159X.2024.2327477. Epub 2024 Mar 13.

Abstract

AIM

This study aimed to evaluate the real-life performance of clinical vignettes and multiple-choice questions generated by using ChatGPT.

METHODS

This was a randomized controlled study in an evidence-based medicine training program. We randomly assigned seventy-four medical students to two groups. The ChatGPT group received ill-defined cases generated by ChatGPT, while the control group received human-written cases. At the end of the training, they evaluated the cases by rating 10 statements using a Likert scale. They also answered 15 multiple-choice questions (MCQs) generated by ChatGPT. The case evaluations of the two groups were compared. Some psychometric characteristics (item difficulty and point-biserial correlations) of the test were also reported.

RESULTS

None of the scores on the 10 statements regarding the cases showed a significant difference between the ChatGPT group and the control group (p > .05). In the test, only six MCQs had acceptable levels (higher than 0.30) of point-biserial correlation, and five items could be considered acceptable in classroom settings.
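For readers unfamiliar with the two item statistics reported here, the following is a minimal sketch of how item difficulty and point-biserial correlation are commonly computed. It is not the authors' analysis code: the function names and the response data are hypothetical, and it assumes the uncorrected form of the statistic, in which the total score includes the item being evaluated.

```python
from math import sqrt
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and total test scores."""
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    wrong = [t for i, t in zip(item_scores, total_scores) if i == 0]
    sd = pstdev(total_scores)  # population standard deviation of the totals
    if not correct or not wrong or sd == 0:
        return float("nan")  # undefined when the item or the totals do not vary
    p = len(correct) / len(item_scores)
    return (mean(correct) - mean(wrong)) / sd * sqrt(p * (1 - p))


# Hypothetical responses from six examinees (illustrative only, not study data).
item = [1, 1, 0, 1, 0, 0]
totals = [14, 12, 7, 11, 9, 6]

difficulty = item_difficulty(item)
r_pb = point_biserial(item, totals)
# 0.30 is the acceptability threshold quoted in the abstract.
print(f"difficulty={difficulty:.2f}, r_pb={r_pb:.2f}, acceptable={r_pb > 0.30}")
```

An item with a point-biserial value above 0.30 is answered correctly more often by examinees with high total scores, which is why the abstract treats that value as the cutoff for acceptable discrimination.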

CONCLUSIONS

The results showed that the quality of the vignettes is comparable to that of vignettes created by human authors, and that some multiple-choice questions have acceptable psychometric characteristics. ChatGPT has potential in generating clinical vignettes for teaching and MCQs for assessment in medical education.
