Zec Slavica, Soriani Nicola, Comoretto Rosanna, Baldi Ileana
Department of Cardiac, Thoracic and Vascular Sciences, Unit of Biostatistics, Epidemiology and Public Health, University of Padova, Padova, Italy.
Department of Statistics and Quantitative Methods, University of Milan, Bicocca, Italy.
Open Nurs J. 2017 Oct 31;11:211-218. doi: 10.2174/1874434601711010211. eCollection 2017.
Cohen's Kappa is the most widely used agreement statistic in the literature. Under certain conditions, however, it is affected by a paradox that returns biased estimates of the statistic itself.
The aim of this study is to give the reader enough information to make an informed choice of agreement measure, by highlighting some desirable properties of Gwet's AC1 in comparison with Cohen's Kappa, using a real data example.
During a literature review, we asked a panel of three evaluators to judge the quality of 57 randomized controlled trials, assigning each trial a score on the Jadad scale. Quality was evaluated along the following dimensions: adopted design, randomization unit, and type of primary endpoint. For each of these features, the agreement among the three evaluators was calculated using Cohen's Kappa statistic and Gwet's AC1 statistic, and the resulting values were then compared with the observed agreement.
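For reference, the two statistics can be written out for the simplest case of two raters and q categories. This is a sketch of the standard textbook definitions, not necessarily the exact multi-rater formulation used for the three evaluators in this study. The symbols are assumptions introduced here: p_{kl} is the proportion of trials placed in category k by the first rater and category l by the second, and p_{k+}, p_{+k} are the corresponding marginal proportions.

```latex
% Observed agreement (shared by both statistics)
p_o = \sum_{k=1}^{q} p_{kk}

% Cohen's Kappa: chance agreement from the product of the marginals
p_e^{\kappa} = \sum_{k=1}^{q} p_{k+}\, p_{+k},
\qquad
\kappa = \frac{p_o - p_e^{\kappa}}{1 - p_e^{\kappa}}

% Gwet's AC1: chance agreement from the average marginal of each category
\pi_k = \frac{p_{k+} + p_{+k}}{2},
\qquad
p_e^{AC1} = \frac{1}{q-1} \sum_{k=1}^{q} \pi_k \,(1 - \pi_k),
\qquad
AC1 = \frac{p_o - p_e^{AC1}}{1 - p_e^{AC1}}
```

The two measures differ only in the chance-agreement term. When one category dominates, the Kappa term approaches 1 and the denominator of Kappa shrinks, whereas the AC1 term approaches 0.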
The values of Cohen's Kappa would lead one to believe that the agreement levels for the variables Unit, Design and Primary Endpoints are totally unsatisfactory. The AC1 statistic, by contrast, shows plausible values that are in line with the corresponding values of observed agreement.
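The contrast described here can be reproduced with a small sketch. The code below is an illustration only: it handles two raters rather than three, uses the standard two-rater definitions of both statistics, and runs on made-up ratings (the function names, category labels and counts are all hypothetical and are not the study's data). It is meant to show how a heavily imbalanced category distribution can push Kappa toward zero while the observed agreement stays high.

```python
from collections import Counter


def observed_agreement(r1, r2):
    """Proportion of items on which the two raters give the same category."""
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)


def cohen_kappa(r1, r2):
    """Two-rater Cohen's Kappa; undefined if chance agreement equals 1."""
    n = len(r1)
    categories = set(r1) | set(r2)
    c1, c2 = Counter(r1), Counter(r2)
    p_o = observed_agreement(r1, r2)
    # Chance agreement: product of the two raters' marginal proportions.
    p_e = sum((c1[k] / n) * (c2[k] / n) for k in categories)
    return (p_o - p_e) / (1 - p_e)


def gwet_ac1(r1, r2):
    """Two-rater Gwet's AC1; needs at least two categories in the data."""
    n = len(r1)
    categories = set(r1) | set(r2)
    q = len(categories)
    c1, c2 = Counter(r1), Counter(r2)
    p_o = observed_agreement(r1, r2)
    # Chance agreement: based on the average marginal proportion per category.
    pi = [(c1[k] + c2[k]) / (2 * n) for k in categories]
    p_e = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_o - p_e) / (1 - p_e)


# HYPOTHETICAL ratings for 57 trials (not the study's data): both raters say
# "parallel" on 53 trials and disagree on the remaining 4.
rater_a = ["parallel"] * 53 + ["parallel"] * 2 + ["crossover"] * 2
rater_b = ["parallel"] * 53 + ["crossover"] * 2 + ["parallel"] * 2

print(f"observed agreement: {observed_agreement(rater_a, rater_b):.3f}")  # 0.930
print(f"Cohen's Kappa:      {cohen_kappa(rater_a, rater_b):.3f}")         # -0.036
print(f"Gwet's AC1:         {gwet_ac1(rater_a, rater_b):.3f}")            # 0.925
```

With these invented ratings the raters agree on about 93% of trials, yet Kappa is slightly negative, while AC1 stays close to the observed agreement.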
We conclude that it would always be appropriate to adopt the AC1 statistic, thereby avoiding the risk of incurring the paradox and of drawing wrong conclusions from the agreement analysis.