Svensson E, Holm S
Department of Mathematics, Chalmers University of Technology, Göteborg, Sweden.
Stat Med. 1994;13(23-24):2437-53. doi: 10.1002/sim.4780132308.
We introduce a new statistical method that separates and measures different types of variability between paired ordered categorical measurements. The key to the separation is a two-way augmented ranking of the observations in a contingency table: cases classified in a specific category by one rater are internally ranked according to the classifications from the other rater. This enables us to extract the component of interobserver variation that is not systematic. The variance of the rank differences between judgements is a suitable measure of this interrater variability, which we characterize as random. The empirical measure of random interjudge disagreement, which lies between 0 and 1, is called the relative rank variance and is an estimate of a parameter defined on the multinomial probability distribution in the contingency table. The systematic differences are determined by the marginals and described by two empirical measures, relative position and relative concentration; both measures lie between -1 and 1. Our method is applied to data sets from a reliability study of two clinical rating scales for assessing hydrocephalus and subarachnoid haemorrhage.
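The augmented ranking described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the NumPy-based layout and the example table are assumptions made for illustration. Rows index the categories chosen by one rater (X) and columns those chosen by the other (Y); each cell receives the mean rank of its cases under both orderings. The sketch returns the plain variance of the rank differences. The abstract does not state the normalization that scales this quantity to the relative rank variance, so it is not reproduced here, and relative position and relative concentration are likewise omitted.

```python
import numpy as np

def augmented_ranks(table):
    """Mean augmented rank of every cell under both orderings.

    table[i, j] = number of cases placed in category i by rater X
    and in category j by rater Y (categories in increasing order).
    """
    t = np.asarray(table, dtype=float)

    # X ordering: sort by X category, then by Y category within it.
    lower_rows = np.concatenate(([0.0], np.cumsum(t.sum(axis=1))[:-1]))
    same_row_lower = np.cumsum(t, axis=1) - t
    rank_x = lower_rows[:, None] + same_row_lower + (t + 1) / 2

    # Y ordering: sort by Y category, then by X category within it.
    lower_cols = np.concatenate(([0.0], np.cumsum(t.sum(axis=0))[:-1]))
    same_col_lower = np.cumsum(t, axis=0) - t
    rank_y = lower_cols[None, :] + same_col_lower + (t + 1) / 2

    # Ranks in empty cells are placeholders; they carry zero weight below.
    return rank_x, rank_y

def rank_difference_variance(table):
    """Unnormalized variance of the augmented rank differences."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    rank_x, rank_y = augmented_ranks(t)
    d = rank_x - rank_y
    mean_d = np.sum(t * d) / n  # zero, since both rankings use ranks 1..n
    return np.sum(t * (d - mean_d) ** 2) / n

# Hypothetical 3-category table, invented for illustration only.
example = [[4, 1, 0],
           [1, 3, 1],
           [0, 1, 4]]
print(rank_difference_variance(example))
```

A table with all cases on the main diagonal gives identical rankings and therefore a variance of zero, which matches the idea that perfect agreement leaves no random component.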