Yang Liu (a), Alberto Maydeu-Olivares (b)
(a) The University of North Carolina at Chapel Hill.
(b) Faculty of Psychology, University of Barcelona.
Multivariate Behav Res. 2014 Jul-Aug;49(4):354-71. doi: 10.1080/00273171.2014.910744.
When an item response theory model fails to fit adequately, the items for which the model provides a good fit and those for which it does not must be determined. To this end, we compare the performance of several fit statistics for item pairs with known asymptotic distributions under maximum likelihood estimation of the item parameters: (a) a mean and variance adjustment to bivariate Pearson's X², (b) a bivariate subtable analog to Reiser's (1996) overall goodness-of-fit test, (c) a z statistic for the bivariate residual cross product, and (d) Maydeu-Olivares and Joe's (2006) M₂ statistic applied to bivariate subtables. The unadjusted Pearson's X² with heuristically determined degrees of freedom is also included in the comparison. For binary and ordinal data, our simulation results suggest that the z statistic has the best Type I error and power behavior among all the statistics under investigation when the observed information matrix is used in its computation. However, if one has to use the cross-product information, the mean and variance adjusted X² is recommended. We illustrate the use of pairwise fit statistics in 2 real-data examples and discuss possible extensions of the current research in various directions.
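As a rough illustration of the raw ingredients behind two of the pairwise statistics named above, the following Python sketch computes the unadjusted Pearson's X² and the residual cross product for a single item pair. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, categories are assumed to be scored 0, 1, ..., K-1, and the model-implied bivariate probabilities are assumed to come from an already fitted item response theory model. The degrees of freedom, the mean and variance adjustment, and the standard error needed to turn the residual into a z statistic all depend on the asymptotic covariance of the parameter estimates and are deliberately left out.

```python
import numpy as np


def bivariate_table(x_i, x_j, k_i, k_j):
    """Observed k_i x k_j table of counts for one item pair.

    x_i, x_j: integer response vectors coded 0..k-1 (assumed coding).
    """
    table = np.zeros((k_i, k_j))
    np.add.at(table, (x_i, x_j), 1)  # accumulate one count per respondent
    return table


def pearson_x2(observed, expected_prob):
    """Unadjusted Pearson's X^2 for a bivariate subtable.

    expected_prob: model-implied cell probabilities (assumed to be supplied
    by a fitted model; they must be positive and sum to 1).
    """
    n = observed.sum()
    expected = n * expected_prob
    return float(((observed - expected) ** 2 / expected).sum())


def residual_cross_product(observed, expected_prob):
    """Sample cross-product moment minus its model-implied value.

    Uses integer category scores 0..k-1 (an assumption). Dividing this
    residual by its asymptotic standard error would give a z statistic;
    that standard error is not computed here.
    """
    n = observed.sum()
    k_i, k_j = observed.shape
    scores = np.outer(np.arange(k_i), np.arange(k_j))
    sample_moment = (scores * observed).sum() / n
    model_moment = (scores * expected_prob).sum()
    return float(sample_moment - model_moment)


if __name__ == "__main__":
    # Hypothetical counts for two binary items and a model that implies
    # equal probability for every cell of the 2 x 2 subtable.
    observed = np.array([[30.0, 20.0], [20.0, 30.0]])
    expected_prob = np.full((2, 2), 0.25)
    print(pearson_x2(observed, expected_prob))
    print(residual_cross_product(observed, expected_prob))
```

A large X² or a large residual relative to its standard error for a given pair would point to that pair as a source of misfit; which reference distribution to use is exactly the question the compared statistics answer differently.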