Hallgren, Kevin A.
University of New Mexico, Department of Psychology.
Tutor Quant Methods Psychol. 2012;8(1):23-34. doi: 10.20982/tqmp.08.1.p023.
Many research designs require the assessment of inter-rater reliability (IRR) to demonstrate consistency among observational ratings provided by multiple coders. However, many studies use incorrect statistical procedures, fail to fully report the information necessary to interpret their results, or do not address how IRR affects the power of their subsequent analyses for hypothesis testing. This paper provides an overview of methodological issues related to the assessment of IRR with a focus on study design, selection of appropriate statistics, and the computation, interpretation, and reporting of some commonly used IRR statistics. Computational examples include SPSS and R syntax for computing Cohen's kappa and intra-class correlations to assess IRR.
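As a brief illustration of the kind of R syntax the tutorial covers (the paper's own code is not reproduced here), the sketch below uses the irr package to compute Cohen's kappa for nominal codes from two raters and an intra-class correlation for continuous ratings. The data frames, variable names, and ratings are illustrative assumptions, not data from the paper.

```r
# Minimal sketch using the 'irr' package; all data below are hypothetical.
# install.packages("irr")  # uncomment if the package is not installed
library(irr)

# Two coders assign nominal categories to eight subjects
nominal_ratings <- data.frame(
  rater1 = factor(c("a", "b", "b", "a", "c", "b", "a", "c")),
  rater2 = factor(c("a", "b", "a", "a", "c", "b", "b", "c"))
)
kappa2(nominal_ratings)  # unweighted Cohen's kappa for two raters

# Two coders give continuous ratings to the same eight subjects
continuous_ratings <- data.frame(
  rater1 = c(4.2, 3.1, 5.0, 2.8, 4.4, 3.9, 2.5, 4.8),
  rater2 = c(4.0, 3.3, 4.7, 3.0, 4.6, 3.5, 2.9, 4.9)
)
# Two-way, absolute-agreement, single-measures ICC
icc(continuous_ratings, model = "twoway", type = "agreement", unit = "single")
```

The choice of ICC variant (one-way vs. two-way model, consistency vs. agreement, single vs. average measures) depends on the study design, which is one of the methodological decisions the paper discusses.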