Department of Human Development and Quantitative Methodology, University of Maryland, College Park, USA.
Department of Psychology, University of South Carolina, Columbia, USA.
Psychometrika. 2019 Jun;84(2):529-553. doi: 10.1007/s11336-019-09667-4. Epub 2019 Mar 20.
In item response theory (IRT), it is often necessary to perform restricted recalibration (RR) of the model: A set of (focal) parameters is estimated holding a set of (nuisance) parameters fixed. Typical applications of RR include expanding an existing item bank, linking multiple test forms, and associating constructs measured by separately calibrated tests. In the current work, we provide full statistical theory for RR of IRT models under the framework of pseudo-maximum likelihood estimation. We describe the standard error calculation for the focal parameters, the assessment of overall goodness-of-fit (GOF) of the model, and the identification of misfitting items. We report a simulation study to evaluate the performance of these methods in the scenario of adding a new item to an existing test. Parameter recovery for the focal parameters as well as Type I error and power of the proposed tests are examined. An empirical example is also included, in which we validate the pediatric fatigue short-form scale in the Patient-Reported Outcome Measurement Information System (PROMIS), compute global and local GOF statistics, and update parameters for the misfitting items.
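To make the idea of restricted recalibration concrete, the following is a minimal sketch of the "add one new item to an existing test" scenario. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: a two-parameter logistic (2PL) model, a standard normal latent trait, equally spaced quadrature nodes, a simple pattern search in place of a proper optimizer, and the simulated item parameter values. The parameters of the existing items are held fixed (here at their generating values, standing in for previously calibrated estimates), and only the new item's slope and location are estimated by maximizing the marginal likelihood. The sketch gives point estimates only; it does not compute the standard errors or fit statistics described above, which must account for the uncertainty in the fixed parameters.

```python
import math
import random


def p2pl(theta, a, b):
    """2PL probability of endorsing an item with slope a and location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def quadrature(n=31, lo=-4.0, hi=4.0):
    """Equally spaced nodes with normalized standard-normal weights (assumed)."""
    nodes = [lo + (hi - lo) * k / (n - 1) for k in range(n)]
    raw = [math.exp(-0.5 * t * t) for t in nodes]
    total = sum(raw)
    return nodes, [r / total for r in raw]


def simulate(n, fixed_items, new_item, seed=1):
    """Simulate responses to the existing items and to the new item."""
    rng = random.Random(seed)
    old, new = [], []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)
        old.append([1 if rng.random() < p2pl(t, a, b) else 0
                    for a, b in fixed_items])
        new.append(1 if rng.random() < p2pl(t, *new_item) else 0)
    return old, new


def fixed_part(old_resp, fixed_items, nodes):
    """Likelihood of each person's old-item responses at each node.

    These values depend only on the fixed (nuisance) parameters, so they
    are computed once and reused for every trial value of the focal ones.
    """
    table = []
    for u in old_resp:
        row = []
        for t in nodes:
            like = 1.0
            for x, (a, b) in zip(u, fixed_items):
                p = p2pl(t, a, b)
                like *= p if x == 1 else 1.0 - p
            row.append(like)
        table.append(row)
    return table


def marginal_loglik(a, b, new_resp, fixed_like, nodes, weights):
    """Marginal log-likelihood as a function of the new item's (a, b) only."""
    p_new = [p2pl(t, a, b) for t in nodes]
    ll = 0.0
    for y, row in zip(new_resp, fixed_like):
        ll += math.log(sum(w * f * (p if y == 1 else 1.0 - p)
                           for w, f, p in zip(weights, row, p_new)))
    return ll


def recalibrate(new_resp, fixed_like, nodes, weights, start=(1.0, 0.0)):
    """Pattern search over the focal parameters within assumed bounds."""
    a, b = start
    best = marginal_loglik(a, b, new_resp, fixed_like, nodes, weights)
    step = 0.5
    while step > 1e-3:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            ca, cb = a + da, b + db
            if not (0.05 <= ca <= 5.0 and -4.0 <= cb <= 4.0):
                continue
            val = marginal_loglik(ca, cb, new_resp, fixed_like, nodes, weights)
            if val > best:
                a, b, best, improved = ca, cb, val, True
        if not improved:
            step /= 2.0  # no better neighbor at this scale: refine the step
    return a, b, best


# Hypothetical setup: 10 existing items (fixed) plus one new item (focal).
fixed_items = [(1.0 + 0.1 * j, -1.5 + 3.0 * j / 9) for j in range(10)]
true_new_item = (1.5, 0.5)

old_resp, new_resp = simulate(1500, fixed_items, true_new_item, seed=1)
nodes, weights = quadrature()
fixed_like = fixed_part(old_resp, fixed_items, nodes)

ll_start = marginal_loglik(1.0, 0.0, new_resp, fixed_like, nodes, weights)
a_hat, b_hat, ll_hat = recalibrate(new_resp, fixed_like, nodes, weights)
print(f"new item estimates: a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Because the old-item likelihoods are tabulated once, each trial value of the focal parameters costs only one pass over persons and nodes. In practice a gradient-based optimizer or an EM algorithm would replace the pattern search.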