Thomsen Niels O B, Olsen Lars H, Nielsen Steen T
Department of Radiology, Sønderborg Hospital, DK-6400 Sønderborg, Denmark.
J Orthop Sci. 2002;7(2):163-6. doi: 10.1007/s007760200028.
Studies using kappa statistics have been conducted with a varied but limited number of observers. The aim of this study was to evaluate how the number of observers affects kappa as a measure of observer variation. One hundred orthopedic specialists were asked to assess ten sets of standard radiographs, randomly sampled from those of 94 consecutive patients with ankle fractures. The observers were randomly allocated into four groups, which were in turn divided into subgroups with an increasing number of observers. Random subgroups of three observers yielded kappa values ranging from 0.20 to 0.64 for the Lauge-Hansen classification system and from 0.27 to 0.90 for the Weber classification system. As the number of observers in the subgroups increased, kappa stabilized around a mean value, indicating that the sampling variation and standard error decreased. The standard error found in this study makes kappa questionable as a measure of agreement among a small number of observers. Consequently, kappa values obtained for a given diagnostic tool at one department are not directly comparable with results from other departments. We conclude that kappa cannot stand alone as a simple measure of observer variation.
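The subgroup effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis. It assumes Fleiss' kappa as the multi-observer statistic (the abstract does not name the variant used), and every rating is synthetic: the "true" categories, the 75% observer accuracy, and the helper names `fleiss_kappa`, `simulate_observers`, and `subgroup_kappas` are all hypothetical choices made for illustration.

```python
import math
import random
import statistics
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list with one label per observer.

    Every case must be rated by the same number of observers (at least two).
    Returns NaN when chance agreement is 1 (all labels identical everywhere).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()          # how often each category was used overall
    agreement_sum = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Proportion of agreeing observer pairs for this case.
        pairs = sum(c * (c - 1) for c in counts.values())
        agreement_sum += pairs / (n_raters * (n_raters - 1))
    p_obs = agreement_sum / n_cases
    p_exp = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    if p_exp == 1.0:
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


def simulate_observers(truth, n_observers, accuracy, categories, rng):
    """Synthetic data: each observer reports the true label with probability
    `accuracy`, otherwise a different category chosen uniformly at random."""
    observers = []
    for _ in range(n_observers):
        labels = []
        for t in truth:
            if rng.random() < accuracy:
                labels.append(t)
            else:
                labels.append(rng.choice([c for c in categories if c != t]))
        observers.append(labels)
    return observers


def subgroup_kappas(observers, size, n_draws, rng):
    """Kappa for `n_draws` random subgroups of `size` observers each."""
    kappas = []
    for _ in range(n_draws):
        group = rng.sample(observers, size)
        # Transpose observer-by-case labels into case-by-observer ratings.
        ratings = [list(case) for case in zip(*group)]
        k = fleiss_kappa(ratings)
        if not math.isnan(k):
            kappas.append(k)
    return kappas


if __name__ == "__main__":
    rng = random.Random(0)
    categories = ["A", "B", "C"]                      # assumed three-way scheme
    truth = ["A", "B", "C", "B", "B", "C", "A", "B", "C", "B"]  # ten synthetic cases
    observers = simulate_observers(truth, 100, 0.75, categories, rng)
    for size in (3, 10, 30):
        ks = subgroup_kappas(observers, size, 200, rng)
        print(size, round(min(ks), 2), round(max(ks), 2),
              round(statistics.stdev(ks), 3))
```

Running the script prints, for each subgroup size, the smallest and largest kappa and the standard deviation across random subgroups. Under these assumptions the spread should narrow as the subgroup grows, which is the pattern the abstract reports for the real observers.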