Gross ST.
Biometrics. 1986 Dec;42(4):883-93.
Published results on the use of the kappa coefficient of agreement have traditionally been concerned with situations where a large number of subjects is classified by a small group of raters. The coefficient is then used to assess the degree of agreement among the raters through hypothesis testing or confidence intervals. A modified kappa coefficient of agreement for multiple categories is proposed and a parameter-free distribution for testing null agreement is provided, for use when the number of raters is large relative to the number of categories and subjects. The large-sample distribution of kappa is shown to be normal in the nonnull case, and confidence intervals for kappa are provided. The results are extended to allow for an unequal number of raters per subject.
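As background, and as an illustration only (the abstract does not give the form of the modified coefficient proposed here), a familiar multi-rater kappa in the style of Fleiss (1971) can be written as follows. With $n$ subjects, $m$ raters per subject, $k$ categories, and $n_{ij}$ the number of raters assigning subject $i$ to category $j$ (so $\sum_{j=1}^{k} n_{ij} = m$):

\[
\bar{p}_j = \frac{1}{nm}\sum_{i=1}^{n} n_{ij}, \qquad
\bar{P} = \frac{1}{n}\sum_{i=1}^{n}\frac{\sum_{j=1}^{k} n_{ij}(n_{ij}-1)}{m(m-1)}, \qquad
\hat{\kappa} = \frac{\bar{P} - \sum_{j=1}^{k}\bar{p}_j^{\,2}}{1 - \sum_{j=1}^{k}\bar{p}_j^{\,2}},
\]

where $\bar{P}$ is the observed pairwise agreement averaged over subjects and $\sum_j \bar{p}_j^{\,2}$ the agreement expected by chance. The extension to an unequal number of raters per subject mentioned in the abstract would, in this sketch, replace the common $m$ with a subject-specific $m_i$.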