ChatGPT to generate clinical vignettes for teaching and multiple-choice questions for assessment: A randomized controlled experiment.

Authors

Coşkun Özlem, Kıyak Yavuz Selim, Budakoğlu Işıl İrem

Affiliation

Department of Medical Education and Informatics, Gazi University, Ankara, Turkey.

Publication information

Med Teach. 2025 Feb;47(2):268-274. doi: 10.1080/0142159X.2024.2327477. Epub 2024 Mar 13.

DOI:10.1080/0142159X.2024.2327477

PMID:38478902

Abstract

AIM

This study aimed to evaluate the real-life performance of clinical vignettes and multiple-choice questions generated by using ChatGPT.

METHODS

This was a randomized controlled study in an evidence-based medicine training program. We randomly assigned seventy-four medical students to two groups. The ChatGPT group received ill-defined cases generated by ChatGPT, while the control group received human-written cases. At the end of the training, they evaluated the cases by rating 10 statements using a Likert scale. They also answered 15 multiple-choice questions (MCQs) generated by ChatGPT. The case evaluations of the two groups were compared. Some psychometric characteristics (item difficulty and point-biserial correlations) of the test were also reported.

RESULTS

None of the scores on the 10 statements regarding the cases showed a significant difference between the ChatGPT group and the control group (p > .05). In the test, only six MCQs had acceptable levels (higher than 0.30) of point-biserial correlation, and five items could be considered acceptable in classroom settings.
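For readers unfamiliar with the two item statistics named above, the following is a minimal sketch of how item difficulty and point-biserial correlation are commonly computed from 0/1 scored responses. It is not the authors' analysis code: the function names, the toy response matrix, and the use of the uncorrected total score (the item itself is included in the total) are assumptions made for illustration. Only the 0.30 cutoff comes from the abstract.

```python
from math import sqrt
from statistics import mean, pstdev

# Cutoff quoted in the abstract for an acceptable point-biserial correlation.
ACCEPTABLE_RPB = 0.30


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (0 to 1)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a 0/1 item and total test scores.

    Uses r_pb = (M1 - M0) / s * sqrt(p * q), where M1 and M0 are the mean
    totals of examinees who got the item right and wrong, s is the
    population standard deviation of the totals, p is the proportion
    correct and q = 1 - p. The total is NOT corrected for the item itself
    (an assumption of this sketch).
    """
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    incorrect = [t for i, t in zip(item_scores, total_scores) if i == 0]
    sd = pstdev(total_scores)
    if not correct or not incorrect or sd == 0:
        return float("nan")  # undefined when everyone scores the same
    p = len(correct) / len(item_scores)
    return (mean(correct) - mean(incorrect)) / sd * sqrt(p * (1 - p))


# Hypothetical toy data: rows are examinees, columns are items (1 = correct).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
    [1, 1, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in responses]

for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    r_pb = point_biserial(item, totals)
    print(f"item {j}: difficulty={item_difficulty(item):.2f}, "
          f"r_pb={r_pb:.2f}, acceptable={r_pb > ACCEPTABLE_RPB}")
```

With real exam data, a corrected item-total correlation (removing the item from the total before correlating) is often preferred for short tests, since including the item tends to inflate the coefficient.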

CONCLUSIONS

The results showed that the quality of the vignettes is comparable to that of vignettes created by human authors, and that some multiple-choice questions have acceptable psychometric characteristics. ChatGPT has potential in generating clinical vignettes for teaching and MCQs for assessment in medical education.

Similar articles

Can ChatGPT Generate Acceptable Case-Based Multiple-Choice Questions for Medical School Anatomy Exams? A Pilot Study on Item Difficulty and Discrimination.

Clin Anat. 2025 May;38(4):505-510. doi: 10.1002/ca.24271. Epub 2025 Mar 24.

Integrating ChatGPT in Orthopedic Education for Medical Undergraduates: Randomized Controlled Trial.

J Med Internet Res. 2024 Aug 20;26:e57037. doi: 10.2196/57037.

Can ChatGPT generate practice question explanations for medical students, a new faculty teaching tool?

Med Teach. 2025 Mar;47(3):560-564. doi: 10.1080/0142159X.2024.2363486. Epub 2024 Jun 20.

ChatGPT versus expert feedback on clinical reasoning questions and their effect on learning: a randomized controlled trial.

Postgrad Med J. 2025 Apr 22;101(1195):458-463. doi: 10.1093/postmj/qgae170.

Artificial Intelligence as a Discriminator of Competence in Urological Training: Are We There?

J Urol. 2025 Apr;213(4):504-511. doi: 10.1097/JU.0000000000004357. Epub 2024 Dec 9.

ChatGPT for generating multiple-choice questions: Evidence on the use of artificial intelligence in automatic item generation for a rational pharmacotherapy exam.

Eur J Clin Pharmacol. 2024 May;80(5):729-735. doi: 10.1007/s00228-024-03649-x. Epub 2024 Feb 14.

Using large language models (ChatGPT, Copilot, PaLM, Bard, and Gemini) in Gross Anatomy course: Comparative analysis.

Clin Anat. 2025 Mar;38(2):200-210. doi: 10.1002/ca.24244. Epub 2024 Nov 21.

AI versus human-generated multiple-choice questions for medical education: a cohort study in a high-stakes examination.

BMC Med Educ. 2025 Feb 8;25(1):208. doi: 10.1186/s12909-025-06796-6.

Large Language Models in Medical Education: Comparing ChatGPT- to Human-Generated Exam Questions.

Acad Med. 2024 May 1;99(5):508-512. doi: 10.1097/ACM.0000000000005626. Epub 2023 Dec 28.

Cited by

Attitudes and usage of ChatGPT among pharmacy students in a Sub-Saharan African country, Zambia: findings and implications on the education system.

BMC Med Educ. 2025 Sep 1;25(1):1237. doi: 10.1186/s12909-025-07833-0.

Evaluation of Multiple-Choice Tests in Head and Neck Ultrasound Created by Physicians and Large Language Models.

Diagnostics (Basel). 2025 Jul 22;15(15):1848. doi: 10.3390/diagnostics15151848.

The application of artificial intelligence-generated content in ophthalmology education.

Front Med (Lausanne). 2025 Jul 18;12:1617537. doi: 10.3389/fmed.2025.1617537. eCollection 2025.

Comparison of AI-generated and clinician-designed multiple-choice questions in emergency medicine exam: a psychometric analysis.

BMC Med Educ. 2025 Jul 1;25(1):949. doi: 10.1186/s12909-025-07528-6.

The role of generative artificial intelligence in psychiatric education- a scoping review.

BMC Med Educ. 2025 Mar 25;25(1):438. doi: 10.1186/s12909-025-07026-9.

Virtual Patient Simulations Using Social Robotics Combined With Large Language Models for Clinical Reasoning Training in Medical Education: Mixed Methods Study.

J Med Internet Res. 2025 Mar 3;27:e63312. doi: 10.2196/63312.

Quality assurance and validity of AI-generated single best answer questions.

BMC Med Educ. 2025 Feb 25;25(1):300. doi: 10.1186/s12909-025-06881-w.

Visual-textual integration in LLMs for medical diagnosis: A preliminary quantitative analysis.

Comput Struct Biotechnol J. 2024 Dec 22;27:184-189. doi: 10.1016/j.csbj.2024.12.019. eCollection 2025.

Beginner-Level Tips for Medical Educators: Guidance on Selection, Prompt Engineering, and the Use of Artificial Intelligence Chatbots.

Med Sci Educ. 2024 Aug 17;34(6):1571-1576. doi: 10.1007/s40670-024-02146-1. eCollection 2024 Dec.

Can ChatGPT generate surgical multiple-choice questions comparable to those written by a surgeon?

Proc (Bayl Univ Med Cent). 2024 Oct 22;38(1):48-52. doi: 10.1080/08998280.2024.2418752. eCollection 2025.
