Yao Lihua
Office of People Analytics, Defense Personnel Assessment Center, Defense Human Resource Activity, United States Department of Defense, Seaside, CA, United States.
Front Psychol. 2019 Mar 5;10:240. doi: 10.3389/fpsyg.2019.00240. eCollection 2019.
Computer adaptive testing (CAT) has been shown to shorten test length and increase the precision of latent trait estimates. Test takers are often asked to respond to several items that relate to the same passage. This study explores three CAT item selection techniques for passage-based items and offers recommendations and guidance on item selection methods that yield better latent trait estimates. Using simulation, the study compared three models for CAT item selection with passages: (a) the testlet-effect model (T); (b) the passage model (P); and (c) the unidimensional IRT model (U). For the T model, the bifactor model with testlet effect, or constrained multidimensional IRT model, was applied. For each of the three models, three procedures were applied: (a) no item exposure control; (b) item exposure control at a rate of 0.2; and (c) item exposure control at a rate of 1. The testlet-effect model performed better than the passage and unidimensional models. The P and U models tended to overestimate the precision of the theta (latent trait) estimates.
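To make the idea of item selection under an exposure-rate cap concrete, the following is a minimal sketch and not the study's procedure. The abstract does not name the selection criterion or the exposure control method, so the sketch assumes a unidimensional two-parameter logistic (2PL) model, maximum Fisher information selection, and a simple cap on each item's observed exposure rate. The names `fisher_information`, `select_item`, and `r_max` are hypothetical.

```python
import math


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def select_item(theta, items, administered, exposure_counts, n_seen, r_max):
    """Pick the most informative eligible item at the current theta estimate.

    items           -- list of (a, b) parameter pairs (assumed 2PL)
    administered    -- set of item indices already given to this examinee
    exposure_counts -- how many earlier examinees saw each item
    n_seen          -- number of earlier examinees
    r_max           -- assumed exposure cap; r_max = 1 never excludes an item
    Returns the chosen item index, or None if no item is eligible.
    """
    best_index, best_info = None, -1.0
    for i, (a, b) in enumerate(items):
        if i in administered:
            continue
        # Observed exposure rate so far (0 when nobody has been tested yet).
        rate = exposure_counts[i] / n_seen if n_seen > 0 else 0.0
        if rate > r_max:
            continue  # item is over-exposed under the assumed cap
        info = fisher_information(theta, a, b)
        if info > best_info:
            best_index, best_info = i, info
    return best_index


# Hypothetical three-item pool; item 0 is the most informative at theta = 0.
pool = [(1.5, 0.0), (1.0, 0.0), (0.8, 1.0)]
uncapped = select_item(0.0, pool, set(), [0, 0, 0], 0, 1.0)
capped = select_item(0.0, pool, set(), [5, 0, 0], 10, 0.2)
```

In this toy pool, the uncapped call picks the most informative item, while the capped call skips it because half of the earlier examinees already saw it. Under the testlet-effect model the information computation would additionally have to account for the passage-specific dimension, which this unidimensional sketch ignores.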