Rios Joseph A, Soland James
University of Minnesota, Minneapolis, MN, USA.
University of Virginia, Charlottesville, VA, USA.
Educ Psychol Meas. 2021 Jun;81(3):569-594. doi: 10.1177/0013164420949896. Epub 2020 Sep 2.
As low-stakes testing contexts increase, low test-taking effort may serve as a serious validity threat. One common solution to this problem is to identify noneffortful responses and treat them as missing during parameter estimation via the effort-moderated item response theory (EM-IRT) model. Although this model has been shown to outperform traditional IRT models (e.g., two-parameter logistic [2PL]) in parameter estimation under simulated conditions, prior research has not examined its performance under violations of the model's assumptions. Therefore, the objective of this simulation study was to examine item and mean ability parameter recovery when violating the assumptions that noneffortful responding occurs randomly (Assumption 1) and is unrelated to the underlying ability of examinees (Assumption 2). Results demonstrated that, across conditions, the EM-IRT model provided item parameter estimates that were robust to violations of Assumption 1. However, bias values greater than 0.20 were observed for the EM-IRT model when violating Assumption 2; nonetheless, these values were still lower than those observed for the 2PL model. In terms of mean ability estimates, results indicated equal performance between the EM-IRT and 2PL models across conditions. Across both models, mean ability estimates were found to be biased by more than 0.25 when violating Assumption 2. However, our accompanying empirical study suggested that this bias occurred under extreme conditions that may not be present in some operational settings. Overall, these results suggest that, when compared with the 2PL model, the EM-IRT model provides superior item parameter estimates and equal mean ability estimates in the presence of model violations under realistic conditions.
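The idea of treating noneffortful responses as missing during estimation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a boolean effort flag is already available for every response (the abstract does not say how responses are flagged), and the names `two_pl_prob`, `em_irt_loglik`, and `effortful` are hypothetical. Under these assumptions, the log-likelihood is the usual 2PL log-likelihood summed only over responses flagged as effortful.

```python
import numpy as np

def two_pl_prob(theta, a, b):
    """2PL probability of a correct response for every person-item pair."""
    # theta: (n_persons,), a and b: (n_items,) -> result: (n_persons, n_items)
    return 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))

def em_irt_loglik(responses, effortful, theta, a, b):
    """2PL log-likelihood with flagged noneffortful responses treated as missing."""
    p = np.clip(two_pl_prob(theta, a, b), 1e-12, 1 - 1e-12)
    cell = responses * np.log(p) + (1 - responses) * np.log(1 - p)
    # Responses with effortful == False contribute nothing to the sum.
    return float(np.sum(np.where(effortful, cell, 0.0)))

# Illustrative data only (assumed values, not taken from the study).
rng = np.random.default_rng(0)
theta = rng.normal(size=200)            # examinee abilities
a = rng.uniform(0.8, 2.0, size=10)      # item discriminations
b = rng.normal(size=10)                 # item difficulties
responses = (rng.random((200, 10)) < two_pl_prob(theta, a, b)).astype(int)
effortful = rng.random((200, 10)) > 0.1  # assumed flags: about 10% noneffortful

ll_all = em_irt_loglik(responses, np.ones_like(effortful), theta, a, b)
ll_masked = em_irt_loglik(responses, effortful, theta, a, b)
```

In this sketch the flags are drawn at random and independently of ability, which corresponds to the setting in which Assumptions 1 and 2 hold. A full analysis would maximize this likelihood over the item and ability parameters rather than evaluate it at known values.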