Bioestadística, Facultad de Medicina, Universidad de Granada, Granada, Spain.
Centro Universitario de la Defensa - ENM, Universidad de Vigo, Vigo, Pontevedra, Spain.
Br J Math Stat Psychol. 2020 Feb;73(1):1-22. doi: 10.1111/bmsp.12167. Epub 2019 May 6.
There is a frequent need to measure the degree of agreement among R observers who independently classify n subjects within K nominal or ordinal categories. The most popular methods are usually kappa-type measurements. When R = 2, Cohen's kappa coefficient (weighted or not) is well known. When defined in the ordinal case while assuming quadratic weights, Cohen's kappa has the advantage of coinciding with the intraclass and concordance correlation coefficients. When R > 2, there are more discrepancies because the definition of the kappa coefficient depends on how the phrase 'an agreement has occurred' is interpreted. In this paper, Hubert's interpretation, that 'an agreement occurs if and only if all raters agree on the categorization of an object', is used, which leads to Hubert's (nominal) and Schuster and Smith's (ordinal) kappa coefficients. Formulae for the large-sample variances of the estimators of all these coefficients are given; they are used to illustrate the different ways of carrying out inference and, with the use of simulation, to select the optimal procedure. In addition, it is shown that Schuster and Smith's kappa coefficient coincides with the intraclass and concordance correlation coefficients when it is likewise defined assuming quadratic weights.
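The "all raters agree" interpretation can be illustrated with a minimal sketch of a nominal kappa for R raters. It is not taken from the paper: the function name `all_agree_kappa` and the input layout (one tuple of labels per subject) are hypothetical, and chance agreement is assumed to be the sum over categories of the product of each rater's marginal proportions, which reduces to Cohen's unweighted kappa when R = 2. The variance formulae and the ordinal (weighted) coefficient discussed in the abstract are not covered.

```python
from collections import Counter

def all_agree_kappa(ratings, categories):
    """Kappa under the 'agreement iff all raters agree' interpretation.

    ratings: list of per-subject tuples, one label per rater (assumed layout).
    categories: iterable of the K possible labels.
    """
    n = len(ratings)
    n_raters = len(ratings[0])

    # Observed agreement: share of subjects on which every rater chose the same category.
    p_o = sum(1 for row in ratings if len(set(row)) == 1) / n

    # Marginal category counts for each rater separately.
    marginals = [Counter(row[j] for row in ratings) for j in range(n_raters)]

    # Chance agreement, assuming raters classify independently with their own marginals.
    p_e = 0.0
    for k in categories:
        prod = 1.0
        for counts in marginals:
            prod *= counts[k] / n
        p_e += prod

    # Undefined when p_e == 1 (all raters always use one and the same category).
    return (p_o - p_e) / (1 - p_e)

# Two raters, four subjects: p_o = 0.75, p_e = 0.5.
two_raters = [("a", "a"), ("a", "a"), ("b", "b"), ("a", "b")]
kappa_2 = all_agree_kappa(two_raters, ["a", "b"])

# Three raters: a subject counts as an agreement only if all three labels match.
three_raters = [(1, 1, 1), (2, 2, 2), (1, 1, 2), (2, 2, 2)]
kappa_3 = all_agree_kappa(three_raters, [1, 2])
```

In the two-rater example the result is 0.5, matching Cohen's unweighted kappa for the same table. In the three-rater example a single dissenting label turns a subject into a disagreement, which is what distinguishes this interpretation from pairwise definitions of agreement.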