
ChatGPT for generating multiple-choice questions: Evidence on the use of artificial intelligence in automatic item generation for a rational pharmacotherapy exam.

Affiliations

Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey.

Gazi Üniversitesi Hastanesi E Blok 9, Kat 06500 Beşevler, Ankara, Turkey.

Publication information

Eur J Clin Pharmacol. 2024 May;80(5):729-735. doi: 10.1007/s00228-024-03649-x. Epub 2024 Feb 14.

Abstract

PURPOSE

Artificial intelligence, specifically large language models such as ChatGPT, offers potentially valuable benefits in question (item) writing. This study aimed to determine the feasibility of generating case-based multiple-choice questions using ChatGPT in terms of item difficulty and discrimination levels.

METHODS

This study involved 99 fourth-year medical students who participated in a rational pharmacotherapy clerkship conducted according to the WHO 6-Step Model. In response to a prompt that we provided, ChatGPT generated ten case-based multiple-choice questions on hypertension. Following an expert panel review, two of these multiple-choice questions were incorporated into a medical school exam without any changes to the questions. Based on the administration of the test, we evaluated their psychometric properties, including item difficulty, item discrimination (point-biserial correlation), and the functionality of the options.
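The psychometric properties named above are standard classical test theory statistics. As an illustrative sketch (not the authors' code), item difficulty is the proportion of examinees answering correctly, and the point-biserial correlation relates a 0/1 item score to the total exam score:

```python
import math

def item_difficulty(item_scores):
    """Item difficulty (p-value): proportion of examinees who answered correctly."""
    return sum(item_scores) / len(item_scores)

def point_biserial(item_scores, total_scores):
    """Point-biserial correlation between a dichotomous item score (0/1)
    and the total test score: r_pb = (M1 - M) / s * sqrt(p / (1 - p)),
    where M1 is the mean total score of examinees who got the item right,
    M and s are the mean and (population) SD of all total scores, and
    p is the item difficulty."""
    n = len(item_scores)
    mean_t = sum(total_scores) / n
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in total_scores) / n)
    correct = [t for s, t in zip(item_scores, total_scores) if s == 1]
    p = len(correct) / n
    if p in (0.0, 1.0) or sd_t == 0:
        return 0.0  # undefined when everyone (or no one) answers correctly
    mean_correct = sum(correct) / len(correct)
    return (mean_correct - mean_t) / sd_t * math.sqrt(p / (1 - p))
```

The data here are hypothetical; in the study, the statistics were computed from the real exam administration with 99 students.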

RESULTS

Both questions exhibited acceptable levels of point-biserial correlation, exceeding the threshold of 0.30 (0.41 and 0.39, respectively). However, one question had three non-functional options (options chosen by fewer than 5% of the exam participants), while the other had none.
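The non-functional-option criterion above is easy to operationalize. A minimal sketch (the option labels and 5% threshold follow the abstract; the response data are hypothetical):

```python
from collections import Counter

def non_functional_options(chosen, options="ABCDE", threshold=0.05):
    """Return the options selected by fewer than `threshold` (default 5%)
    of examinees, i.e. the non-functional distractors."""
    counts = Counter(chosen)
    n = len(chosen)
    return [opt for opt in options if counts[opt] / n < threshold]
```

For example, if 90 of 100 examinees choose A, 5 choose B, and 5 choose C, then D and E are flagged as non-functional.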

CONCLUSIONS

The findings showed that the questions can effectively differentiate between high- and low-performing students, which points to the potential of ChatGPT as an artificial intelligence tool in test development. Future studies may use the prompt to generate items, enhancing the external validity of the results by gathering data from diverse institutions and settings.

