Rousson Valentin, Gasser Theo, Seifert Burkhardt
Department of Biostatistics, Institute for Social and Preventive Medicine, University of Zurich, Sumatrastrasse 30, CH-8006 Zurich, Switzerland.
Stat Med. 2002 Nov 30;21(22):3431-46. doi: 10.1002/sim.1253.
In this paper we review the problem of defining and estimating intrarater, interrater and test-retest reliability of continuous measurements. We argue that the usual notion of product-moment correlation is well suited to the test-retest situation, whereas the concept of intraclass correlation should be used for intrarater and interrater reliability. The key difference between these two approaches is the treatment of systematic error, which in test-retest data is often due to a learning effect. We also consider the reliability of a sum and of a difference of variables and illustrate the effects on components. Further, we compare these approaches to reliability with the concept of limits of agreement proposed by Bland and Altman (for evaluating the agreement between two methods of clinical measurement) and show how product-moment correlation is related to it. We then propose new kinds of limits of agreement which are related to intraclass correlation. A test battery for studying the development of neuro-motor functions in children and adolescents serves as an illustration throughout the paper.
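The contrast the abstract draws between product-moment correlation, intraclass correlation and limits of agreement can be sketched numerically. The snippet below is only an illustration under stated assumptions: the scores are made-up toy numbers (not data from the paper), the intraclass correlation uses a one-way random-effects estimator chosen for simplicity (the paper may define and estimate it differently), and the limits of agreement follow the usual Bland and Altman form of mean difference plus or minus 1.96 standard deviations. The function names are hypothetical.

```python
import math

def pearson(x, y):
    """Product-moment correlation; insensitive to a constant shift between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def icc_oneway(x, y):
    """One-way random-effects intraclass correlation for k = 2 measurements
    per subject (an assumed estimator, used here for illustration only).
    A systematic shift inflates the within-subject variance and lowers it."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    means = [(a + b) / k for a, b in zip(x, y)]
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for a, b, m in zip(x, y, means)) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def limits_of_agreement(x, y):
    """Bland-Altman style limits: mean difference +/- 1.96 SD of the differences."""
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    sd_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    return mean_d - 1.96 * sd_d, mean_d + 1.96 * sd_d

# Toy test-retest scores (invented): the retest is shifted upward,
# mimicking a learning effect.
test = [10.0, 12.0, 14.0, 16.0, 18.0]
retest = [12.5, 13.5, 16.5, 17.5, 20.5]

r = pearson(test, retest)
icc = icc_oneway(test, retest)
lower, upper = limits_of_agreement(test, retest)

print(f"product-moment correlation: {r:.3f}")
print(f"one-way intraclass correlation: {icc:.3f}")
print(f"limits of agreement: ({lower:.2f}, {upper:.2f})")
```

On these toy numbers the product-moment correlation stays high despite the shift, while the intraclass correlation is noticeably lower and the limits of agreement are centred away from zero, which is the systematic-error distinction the abstract describes.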