Kıyak Yavuz Selim, Emekli Emre, Coşkun Özlem, Budakoğlu Işıl İrem
Department of Medical Education and Informatics, Faculty of Medicine, Gazi University, Ankara, Turkey.
Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, Kraków, Poland.
Med Teach. 2025 Apr;47(4):744-747. doi: 10.1080/0142159X.2024.2430360. Epub 2024 Nov 27.
Manually creating multiple-choice questions (MCQs) is inefficient. Automatic item generation (AIG) offers a scalable solution, with two main approaches: template-based and non-template-based (AI-driven). Template-based AIG ensures accuracy but requires significant expert input to develop templates. In contrast, AI-driven AIG can generate questions quickly but may introduce inaccuracies. The Hybrid AIG approach combines the strengths of both methods. However, MCQs have not yet been generated using the Hybrid AIG approach, and no validity evidence has been provided for it.
We generated MCQs using the Hybrid AIG approach and investigated the validity evidence of these questions by determining whether experts could identify the correct answers. We used a custom ChatGPT to develop an item template, which was then fed into Gazitor, a template-based (non-AI) AIG software. A panel of medical doctors then identified the answer to each question.
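The template-based step can be illustrated with a minimal sketch. This is not Gazitor's implementation or its template format, which the abstract does not describe; the function `generate_items`, the stem wording, and the placeholder findings and diagnoses are all hypothetical. The sketch only shows the general idea of template-based AIG: a fixed stem with variable slots is expanded deterministically, and the keyed answer comes from an explicit rule rather than from a language model.

```python
from itertools import product


def generate_items(stem_template, variables, answer_rule, option_pool):
    """Expand one item template into MCQs, one per combination of variable values.

    Hypothetical helper for illustration; not the Gazitor API.
    """
    names = list(variables)
    items = []
    for combo in product(*(variables[name] for name in names)):
        values = dict(zip(names, combo))
        correct = answer_rule(values)  # keyed answer is rule-based, not AI-generated
        if correct not in option_pool:
            raise ValueError(f"keyed answer {correct!r} missing from option pool")
        items.append({
            "stem": stem_template.format(**values),
            "options": sorted(option_pool),
            "answer": correct,
        })
    return items


# Placeholder content only: no real clinical mapping is implied.
template = ("A {age}-year-old patient presents with {finding}. "
            "What is the most likely diagnosis?")
variables = {"age": [25, 60], "finding": ["finding A", "finding B"]}
key_by_finding = {"finding A": "diagnosis A", "finding B": "diagnosis B"}
option_pool = ["diagnosis A", "diagnosis B", "diagnosis C", "diagnosis D"]

items = generate_items(template, variables,
                       lambda v: key_by_finding[v["finding"]], option_pool)
for item in items:
    print(item["stem"], "->", item["answer"])
```

In a hybrid workflow of the kind described here, the AI model would help draft the template and its variable values, while the expansion itself stays deterministic, so every generated item inherits its answer key from the template.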
Of 105 expert decisions, 101 (96.2%) matched the answer keyed as correct by the software. In all MCQs (100%), the experts reached a consensus on the correct answer. This evidence corresponds to the 'Relations to Other Variables' source of evidence in Messick's validity framework.
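As a rough sketch of how such figures can be tallied, the snippet below computes a decision-level match rate and a per-item consensus flag. The abstract does not state how consensus was defined or how many experts and items made up the 105 decisions, so the majority rule used here, the function `agreement_summary`, and the toy data are assumptions for illustration only. The final line simply re-derives the reported percentage from the reported counts.

```python
from collections import Counter


def agreement_summary(decisions, keys):
    """Summarise expert agreement with the keyed answers.

    decisions: item id -> list of expert answers (hypothetical layout)
    keys:      item id -> answer keyed as correct by the generator
    """
    total = sum(len(answers) for answers in decisions.values())
    matched = sum(answer == keys[item]
                  for item, answers in decisions.items()
                  for answer in answers)
    consensus = {}
    for item, answers in decisions.items():
        top_answer, count = Counter(answers).most_common(1)[0]
        # Assumption: "consensus" means a strict majority chose the keyed answer.
        consensus[item] = top_answer == keys[item] and count > len(answers) / 2
    return matched, total, consensus


# Toy data, not the study's data.
decisions = {"Q1": ["A", "A", "A"], "Q2": ["B", "B", "C"]}
keys = {"Q1": "A", "Q2": "B"}
matched, total, consensus = agreement_summary(decisions, keys)
print(matched, total, consensus)

# Reported figures from the abstract: 101 of 105 decisions matched.
reported_rate = round(100 * 101 / 105, 1)
print(reported_rate)  # 96.2
```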
The Hybrid AIG approach can enhance the efficiency of MCQ generation while maintaining accuracy. It mitigates concerns about hallucinations while benefiting from AI.