Verhavert San, De Maeyer Sven, Donche Vincent, Coertjens Liesje
University of Antwerp, Belgium.
Université Catholique de Louvain, Louvain-la-Neuve, Belgium.
Appl Psychol Meas. 2018 Sep;42(6):428-445. doi: 10.1177/0146621617748321. Epub 2017 Dec 31.
Comparative judgment (CJ) is an alternative method for assessing competences, based on Thurstone's law of comparative judgment. Assessors compare pairs of students' work (representations) and judge which one is better on a given competence. These judgments are analyzed using the Bradley-Terry-Luce model, yielding logit estimates for the representations. In this context, the Scale Separation Reliability (SSR), which originates in Rasch modeling, is typically used as the reliability measure. To the authors' knowledge, however, it has never been systematically investigated whether the meaning of the SSR can be transferred from Rasch modeling to CJ. Because the meaning of reliability matters for both assessment theory and practice, the current study examines this question. A meta-analysis is performed on 26 CJ assessments. For every assessment, split-halves are performed based on assessor. The rank orders of the whole assessment and of the halves are correlated and compared with SSR values using Bland-Altman plots. Comparing the correlation between the halves of an assessment with the SSR of the whole assessment showed that the SSR is a good measure of split-half reliability. Comparing the SSR of one of the halves with the correlation between the two respective halves showed that the SSR can also be interpreted as an interrater correlation. Regarding the SSR as an expression of a correlation with the truth, the results are mixed.
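To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It fits Bradley-Terry-Luce logits to pairwise judgments, computes the SSR with the usual Rasch separation-reliability formula (observed variance minus mean squared standard error, divided by observed variance), and performs one assessor-based split-half. Everything specific here is an assumption made for illustration: the function names (`fit_btl`, `ssr`, `spearman`, `split_half`, `simulate`), the damped diagonal-Newton fitting routine, the small ridge penalty, the diagonal-information standard errors, the use of a single random split, and the simulated data.

```python
import math
import random


def fit_btl(n_items, judgments, n_iter=200, ridge=0.01):
    """Fit Bradley-Terry-Luce logits from (winner, loser) index pairs.

    Uses damped diagonal-Newton updates. The small ridge penalty is an
    assumption of this sketch; it keeps estimates finite when an item
    wins or loses all of its comparisons.
    """
    theta = [0.0] * n_items
    for _ in range(n_iter):
        grad = [-ridge * t for t in theta]
        info = [ridge] * n_items
        for w, l in judgments:
            p = 1.0 / (1.0 + math.exp(theta[l] - theta[w]))  # P(w beats l)
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
            info[w] += p * (1.0 - p)
            info[l] += p * (1.0 - p)
        theta = [t + 0.5 * g / i for t, g, i in zip(theta, grad, info)]
        mean = sum(theta) / n_items
        theta = [t - mean for t in theta]  # identify the scale: mean logit 0
    # Approximate standard errors from the diagonal information of the
    # last iteration (ignores covariances between items).
    se = [1.0 / math.sqrt(i) for i in info]
    return theta, se


def ssr(theta, se):
    """Scale Separation Reliability: (observed var - mean SE^2) / observed var."""
    n = len(theta)
    mean = sum(theta) / n
    obs_var = sum((t - mean) ** 2 for t in theta) / (n - 1)
    mse = sum(s * s for s in se) / n
    return (obs_var - mse) / obs_var


def spearman(x, y):
    """Spearman rank correlation; assumes no ties (ties are not averaged)."""
    def ranks(v):
        order = sorted(range(len(v)), key=v.__getitem__)
        r = [0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    m = (len(x) - 1) / 2
    cov = sum((a - m) * (b - m) for a, b in zip(rx, ry))
    var = sum((a - m) ** 2 for a in rx)
    return cov / var


def split_half(n_items, by_assessor, rng):
    """Randomly split assessors in two, fit each half, correlate rank orders.

    Returns (correlation between halves, SSR of the first half).
    """
    assessors = list(by_assessor)
    rng.shuffle(assessors)
    half = len(assessors) // 2
    a = [j for k in assessors[:half] for j in by_assessor[k]]
    b = [j for k in assessors[half:] for j in by_assessor[k]]
    theta_a, se_a = fit_btl(n_items, a)
    theta_b, _ = fit_btl(n_items, b)
    return spearman(theta_a, theta_b), ssr(theta_a, se_a)


def simulate(n_items=20, n_assessors=10, per_assessor=100, seed=0):
    """Hypothetical data: judgments drawn from a BTL model with known logits."""
    rng = random.Random(seed)
    true = [-2 + 4 * i / (n_items - 1) for i in range(n_items)]
    data = {}
    for a in range(n_assessors):
        js = []
        for _ in range(per_assessor):
            i, j = rng.sample(range(n_items), 2)
            p = 1.0 / (1.0 + math.exp(true[j] - true[i]))
            js.append((i, j) if rng.random() < p else (j, i))
        data[a] = js
    return data
```

A possible use is to call `simulate()`, pool all judgments, and compare `ssr(*fit_btl(20, pooled))` with the correlation returned by `split_half(20, data, random.Random(1))`. In practice one would presumably repeat the split many times and summarize the results rather than rely on a single random split.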