Zijlmans Eva A O, van der Ark L Andries, Tijmstra Jesper, Sijtsma Klaas
Tilburg University, Tilburg, Netherlands.
University of Amsterdam, Amsterdam, Netherlands.
Appl Psychol Meas. 2018 Oct;42(7):553-570. doi: 10.1177/0146621618758290. Epub 2018 Apr 9.
Reliability is usually estimated for a test score, but it can also be estimated for item scores. Item-score reliability can be useful for assessing an item's contribution to the test score's reliability, for identifying unreliable scores in aberrant item-score patterns in person-fit analysis, and for selecting the most reliable item from a test to use as a single-item measure. Four methods were discussed for estimating item-score reliability: the Molenaar-Sijtsma method (method MS), Guttman's method λ6, the latent class reliability coefficient (method LCRC), and the correction for attenuation (method CA). A simulation study was used to compare the methods with respect to median bias, variability (interquartile range [IQR]), and percentage of outliers. The simulation study consisted of six conditions: standard, polytomous items, unequal parameters, two-dimensional data, long test, and small sample size. Methods MS and CA were the most accurate. Method LCRC showed almost unbiased results, but large variability. Method λ6 consistently underestimated item-score reliability, but showed a smaller IQR than the other methods.
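The three comparison criteria named in the abstract (median bias, IQR, and percentage of outliers) can be illustrated with a minimal sketch. It is not the authors' code: the function name `summarize_estimates`, the sample values, and the 1.5 × IQR boxplot rule for flagging outliers are assumptions made for illustration, since the abstract does not state how outliers were defined.

```python
from statistics import median, quantiles


def summarize_estimates(estimates, true_value):
    """Summarize item-score reliability estimates across simulation replications.

    `estimates` holds one estimate per replication; `true_value` is the
    population item-score reliability used to generate the data.
    """
    # Bias of each replication: estimate minus the true reliability.
    bias = [e - true_value for e in estimates]

    # Quartiles of the bias distribution; the IQR measures variability.
    q1, _, q3 = quantiles(bias, n=4, method="inclusive")
    iqr = q3 - q1

    # ASSUMPTION: outliers are flagged with the conventional boxplot rule
    # (more than 1.5 * IQR beyond the quartiles), not a rule from the paper.
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    n_outliers = sum(1 for b in bias if b < lower or b > upper)

    return {
        "median_bias": median(bias),
        "iqr": iqr,
        "pct_outliers": 100.0 * n_outliers / len(bias),
    }


# Hypothetical estimates from seven replications with true reliability 0.30.
result = summarize_estimates([0.28, 0.30, 0.31, 0.29, 0.32, 0.30, 0.55], 0.30)
print(result)
```

Under these criteria, a method can have a median bias near zero and still be unattractive if its IQR is large, which is the pattern the abstract reports for method LCRC.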