Hsu Chia-Ling, Wang Wen-Chung
The Education University of Hong Kong, Tai Po, Hong Kong.
Appl Psychol Meas. 2019 Sep;43(6):464-480. doi: 10.1177/0146621618800280. Epub 2018 Oct 26.
To date, multidimensional computerized adaptive testing (MCAT) has been developed in conjunction with compensatory multidimensional item response theory (MIRT) models rather than non-compensatory ones. In recognition of the usefulness of MCAT and the complications associated with non-compensatory data, this study aimed to develop MCAT algorithms using non-compensatory MIRT models and to evaluate their performance. Three item selection methods were adapted and compared: the Fisher information method, the mutual information method, and the Kullback-Leibler information method. A series of simulations showed that the Fisher information and mutual information methods performed similarly, and that both outperformed the Kullback-Leibler information method. In addition, the more stringent the termination criterion and the higher the correlation between the latent traits, the higher the resulting measurement precision and test reliability. Test reliability was very similar across the dimensions, regardless of the correlation between the latent traits or the termination criterion. On average, the difficulties of the administered items were lower than the examinees' abilities, a finding that can inform item bank construction for non-compensatory items.
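To make the Fisher information method more concrete, the following is a minimal sketch of one item selection step under a non-compensatory model. It is not the authors' implementation. It assumes a two-dimensional Rasch-type non-compensatory model in which the probability of a correct response is the product of per-dimension logistic terms, and it assumes a D-optimality rule (maximize the determinant of the accumulated information matrix). The abstract specifies neither choice. The function names, the toy item bank, and the identity matrix used as the starting information are all hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_correct(theta, b):
    """Non-compensatory success probability (assumed model):
    the product over dimensions of logistic(theta_k - b_k)."""
    p = 1.0
    for t, d in zip(theta, b):
        p *= sigmoid(t - d)
    return p


def item_information(theta, b):
    """Fisher information matrix of one dichotomous item at theta.

    For a binary response, I = grad(P) grad(P)^T / (P * (1 - P)).
    Under the product model, dP/dtheta_k = P * (1 - logistic(theta_k - b_k)).
    """
    p = prob_correct(theta, b)
    grad = [p * (1.0 - sigmoid(t - d)) for t, d in zip(theta, b)]
    scale = 1.0 / (p * (1.0 - p))
    return [[scale * gi * gj for gj in grad] for gi in grad]


def det2(m):
    """Determinant of a 2x2 matrix (this sketch is limited to two dimensions)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def select_item(theta, bank, accumulated):
    """Return the index of the bank item that maximizes
    det(accumulated + item information) at the provisional theta."""
    best_idx, best_det = -1, -math.inf
    for idx, b in enumerate(bank):
        info = item_information(theta, b)
        total = [[accumulated[i][j] + info[i][j] for j in range(2)]
                 for i in range(2)]
        d = det2(total)
        if d > best_det:
            best_idx, best_det = idx, d
    return best_idx


# Hypothetical inputs: provisional ability estimate, a three-item toy bank of
# per-dimension difficulties, and an identity matrix standing in for the
# information accumulated so far (for example, a prior precision).
theta_hat = (0.0, 0.0)
bank = [(-0.5, -0.5), (0.0, 0.0), (2.0, 2.0)]
start_info = [[1.0, 0.0], [0.0, 1.0]]
chosen = select_item(theta_hat, bank, start_info)
```

In this toy bank the rule picks the first item, whose difficulties sit slightly below the provisional abilities, rather than the item matched exactly to them. That is only an illustration under the assumptions stated above, not a reproduction of the reported simulations.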