Sommer Markus, Arendasy Martin
Department of Psychology, University of Graz, Universitätsplatz 2, 8010 Graz, Austria.
J Intell. 2025 Aug 12;13(8):102. doi: 10.3390/jintelligence13080102.
This article critically reviews conceptually different approaches to automatic item generation and to transformer-based automatic item generation. Starting from a discussion of the challenges that have arisen from changes in the use of psychometric tests in recent decades, we outline the requirements that these approaches should ideally fulfill. Each approach is then examined individually to determine the extent to which it can help meet these challenges. In doing so, we focus on three aspects: cost savings during the actual item construction phase; the extent to which an approach may enhance test validity; and potential cost savings in the item calibration phase, due either to a smaller sample size required for item calibration or to reduced item loss from insufficient psychometric characteristics. The article also identifies recurring themes across these conceptually different approaches and highlights areas within each approach that warrant further scientific research.