Department of Epidemiology and Biostatistics, EMGO Institute for Health and Care Research, VU University Medical Center, Amsterdam, Netherlands.
BMJ. 2013 Apr 12;346:f2125. doi: 10.1136/bmj.f2125.
Clinicians are interested in observer variation in terms of the probability of other raters (interobserver) or themselves (intraobserver) obtaining the same answer. Cohen's κ is commonly used in the medical literature to express such agreement in categorical outcomes. The value of Cohen's κ, however, is not sufficiently informative because it is a relative measure, while the clinician's question of observer variation calls for an absolute measure. Using an example in which the observed agreement and κ lead to different conclusions, we illustrate that percentage agreement is an absolute measure (a measure of agreement) and that κ is a relative measure (a measure of reliability). For the data to be useful for clinicians, measures of agreement should be used. The proportion of specific agreement, expressing the agreement separately for the positive and the negative ratings, is the most appropriate measure for conveying the relevant information in a 2 × 2 table and is most informative for clinicians.
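The measures contrasted in the abstract can be made concrete with standard formulas for a 2 × 2 table. The sketch below (hypothetical counts, not data from the article) computes observed percentage agreement, Cohen's κ, and the proportions of specific positive and negative agreement:

```python
def agreement_measures(a, b, c, d):
    """Agreement measures for a 2x2 table of two raters.

    Hypothetical table layout:
                 rater 2 +   rater 2 -
    rater 1 +        a           b
    rater 1 -        c           d
    """
    n = a + b + c + d
    # Observed (percentage) agreement: an absolute measure
    p_obs = (a + d) / n
    # Chance-expected agreement from the marginal totals
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    # Cohen's kappa: agreement corrected for chance (a relative measure)
    kappa = (p_obs - p_exp) / (1 - p_exp)
    # Proportions of specific agreement for positive and negative ratings
    p_pos = 2 * a / (2 * a + b + c)
    p_neg = 2 * d / (2 * d + b + c)
    return p_obs, kappa, p_pos, p_neg

# Illustrative counts with a high prevalence of positive ratings:
# observed agreement is high (0.85) while kappa is low (about 0.32),
# the kind of divergence the article's example turns on.
p_obs, kappa, p_pos, p_neg = agreement_measures(80, 10, 5, 5)
```

With these counts the specific agreements separate cleanly: positive agreement is about 0.91 while negative agreement is only 0.40, information that neither the single percentage agreement nor κ conveys on its own.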