Wolfe Edward W, Song Tian
Pearson, 3974 Roberts Ridge NE, Iowa City, IA 52240, USA.
J Appl Meas. 2014;15(2):152-9.
A large body of literature describes how rater effects may be detected in rating data. In this study, we compared the flag and agreement rates for several rater effects based on calibration of a real data set under two psychometric models: the Rasch rating scale model (RSM) and the Rasch testlet-based rater bundle model (RBM). The results show that the RBM provided more accurate diagnoses of rater severity and leniency than did the RSM, which is based on the local independence assumption. However, the statistical indicators associated with rater centrality and inaccuracy remained consistent between the two models.