Guggenmoos-Holzmann I
Institute of Medical Statistics and Information Science, Freie Universität Berlin, Germany.
J Clin Epidemiol. 1996 Jul;49(7):775-82. doi: 10.1016/0895-4356(96)00011-x.
A framework, the "agreement concept," is developed to study the use of Cohen's kappa as well as alternative measures of chance-corrected agreement in a unified manner. Focusing on intrarater consistency, it is demonstrated that for 2 × 2 tables an adequate choice between different measures of chance-corrected agreement can be made only if the characteristics of the observational setting are taken into account. In particular, a naive use of Cohen's kappa may lead to strikingly overoptimistic estimates of chance-corrected agreement. Such bias can be overcome by more elaborate study designs that allow for an unrestricted estimation of the probabilities at issue. When Cohen's kappa is appropriately applied as a measure of chance-corrected agreement, its values prove to be a linear, and not a parabolic, function of true prevalence. It is further shown how the validity of ratings is influenced by lack of consistency. Depending on the design of a validity study, this may lead, on purely formal grounds, to prevalence-dependent estimates of sensitivity and specificity. Proposed formulas for "chance-corrected" validity indexes fail to adjust for this phenomenon.
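The chance correction and prevalence dependence discussed in the abstract can be illustrated with the standard definition of Cohen's kappa for a 2 × 2 table, κ = (p_o − p_e)/(1 − p_e), where p_o is the observed agreement and p_e the agreement expected from the marginal proportions. The sketch below is an illustrative computation (not the paper's own analysis): two tables with identical observed agreement but different prevalences yield very different kappa values, the kind of prevalence sensitivity the paper examines.

```python
def cohen_kappa(table):
    """Cohen's kappa for a 2x2 contingency table [[a, b], [c, d]]."""
    a, b = table[0]
    c, d = table[1]
    n = a + b + c + d
    p_o = (a + d) / n  # observed proportion of agreement
    # expected agreement by chance, from the row and column marginals
    p_e = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    return (p_o - p_e) / (1 - p_e)

# Both tables show 90% observed agreement, but at different prevalences.
balanced = [[45, 5], [5, 45]]  # roughly 50% prevalence
skewed   = [[85, 5], [5, 5]]   # roughly 90% prevalence
print(round(cohen_kappa(balanced), 2))  # 0.8
print(round(cohen_kappa(skewed), 2))    # 0.44
```

At 50% prevalence the 90% raw agreement translates into κ = 0.80, while at 90% prevalence the same raw agreement gives only κ ≈ 0.44, because the expected chance agreement p_e is much higher in the skewed table.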