Kıyak Yavuz Selim, Soylu Ayşe, Coşkun Özlem, Budakoğlu Işıl İrem, Peker Tuncay Veysel
Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey.
Department of Anatomy, Faculty of Medicine, Gazi University, Ankara, Turkey.
Clin Anat. 2025 May;38(4):505-510. doi: 10.1002/ca.24271. Epub 2025 Mar 24.
Developing high-quality multiple-choice questions (MCQs) for medical school exams is effortful and time-consuming. In this study, we investigated the ability of ChatGPT to generate case-based anatomy MCQs with acceptable levels of item difficulty and discrimination for medical school exams. Following a framework for artificial intelligence (AI)-assisted item generation, we used ChatGPT to generate case-based anatomy MCQs for an endocrine and urogenital system exam. The questions were evaluated by experts, approved by the department, and administered to 502 second-year medical students (372 in the Turkish-language program, 130 in the English-language program). The items were analyzed to determine their discrimination and difficulty indices. The item discrimination indices ranged from 0.29 to 0.54, indicating acceptable differentiation between high- and low-performing students. All six Turkish-language items and five of the six English-language items met the stricter discrimination threshold (≥0.30) required for large-scale standardized tests. The item difficulty indices ranged from 0.41 to 0.89, with most items falling within the moderate difficulty range (0.20-0.80). We therefore conclude that ChatGPT can generate case-based anatomy MCQs with acceptable psychometric properties, offering a promising tool for medical educators. However, human expertise remains crucial for reviewing and refining AI-generated assessment items. Future research should explore AI-generated MCQs across various anatomy topics and investigate different AI models for question generation.
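For readers unfamiliar with the indices reported above, the following is a minimal sketch of how item difficulty and discrimination are typically computed under classical test theory, assuming the common upper/lower 27% group method for the discrimination index. The abstract does not specify the exact analysis procedure used in the study, and the function name and simulated data below are purely illustrative.

```python
import numpy as np

def item_analysis(responses: np.ndarray, group_frac: float = 0.27):
    """Classical item analysis.

    responses: binary matrix (students x items), 1 = correct answer.
    Returns per-item difficulty (proportion correct) and
    discrimination (upper-group minus lower-group proportion correct).
    """
    n_students, _ = responses.shape
    total = responses.sum(axis=1)                  # each student's raw score
    order = np.argsort(total)                      # low scorers first
    k = max(1, int(round(group_frac * n_students)))
    lower, upper = order[:k], order[-k:]

    difficulty = responses.mean(axis=0)            # p: 0 (hard) .. 1 (easy)
    discrimination = (responses[upper].mean(axis=0)
                      - responses[lower].mean(axis=0))  # D index
    return difficulty, discrimination

# Illustrative example: 502 simulated students, 6 items (not the study data)
rng = np.random.default_rng(0)
resp = (rng.random((502, 6)) < rng.uniform(0.4, 0.9, 6)).astype(int)
p, d = item_analysis(resp)
print("difficulty:", p.round(2))       # moderate range is roughly 0.20-0.80
print("discrimination:", d.round(2))   # >= 0.30 often deemed acceptable
```

Under this convention, a difficulty index near 0.89 marks an easy item, and a discrimination index at or above 0.30 is the threshold the abstract cites for large-scale standardized tests.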