Haber Michael, Barnhart Huiman X
Department of Biostatistics, Rollins School of Public Health, Emory University, Atlanta, GA 30322, USA.
Stat Methods Med Res. 2008 Apr;17(2):151-69. doi: 10.1177/0962280206075527. Epub 2007 Aug 14.
We present a general approach to the definition and estimation of coefficients for evaluating agreement between two fixed methods of measurement or human observers. The measured variable is assumed to be continuous with a finite second moment. No other distributional assumptions are made. We introduce the term 'disagreement function' for the function of the observations that is used to quantify the extent of disagreement between the two measurements made on the same subject. The proposed inter-methods agreement coefficients compare the disagreement between measurements made by different methods on the same subject to the corresponding disagreement between replicated measurements made by the same method. Therefore, the new coefficients require data with replicated readings. We propose inter-methods agreement coefficients for two practical situations involving two methods that have a measurement error: 1) comparison of a new method to a gold standard (or a reference method), and 2) comparison of two methods where neither method is considered a gold standard. We consider three disagreement functions based on the differences between two measurements: 1) the mean squared difference, 2) the mean absolute difference and 3) the mean relative difference. We then derive non-parametric estimates for the various agreement coefficients. Our approach is illustrated using data from a study comparing systolic blood pressure measurements by a human observer and an automatic monitor. The performance of the new estimates is assessed via stochastic simulations.
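The comparison described above can be sketched in code. The abstract does not state the coefficient's formula, so the ratio below (within-method disagreement divided by between-method disagreement) is an assumption made for illustration and may differ from the paper's exact definition or estimator. The function names and the toy blood pressure readings are hypothetical. Only the squared and absolute differences are shown, because the abstract does not define the relative difference.

```python
from itertools import combinations, product
from statistics import mean


def sq_diff(a, b):
    """Squared difference between two readings."""
    return (a - b) ** 2


def abs_diff(a, b):
    """Absolute difference between two readings."""
    return abs(a - b)


def within_disagreement(reps, g):
    """Average g over all pairs of replicates per subject, then over subjects."""
    return mean(mean(g(a, b) for a, b in combinations(r, 2)) for r in reps)


def between_disagreement(reps_x, reps_y, g):
    """Average g over all cross-method pairs per subject, then over subjects."""
    return mean(
        mean(g(a, b) for a, b in product(rx, ry))
        for rx, ry in zip(reps_x, reps_y)
    )


def agreement_ratio(reps_x, reps_y, g, reference=True):
    """Assumed form: within-method disagreement / between-method disagreement.

    reference=True treats method X as the gold standard and uses only its
    replicates; otherwise the within-method terms of X and Y are averaged.
    """
    gxy = between_disagreement(reps_x, reps_y, g)
    gxx = within_disagreement(reps_x, g)
    if reference:
        return gxx / gxy
    gyy = within_disagreement(reps_y, g)
    return 0.5 * (gxx + gyy) / gxy


# Hypothetical data: two replicated systolic readings per subject and method.
observer = [[120, 122], [135, 133], [110, 112]]
monitor = [[124, 126], [138, 140], [115, 113]]

ratio_sq = agreement_ratio(observer, monitor, sq_diff, reference=True)
ratio_abs = agreement_ratio(observer, monitor, abs_diff, reference=False)
```

Under this assumed form, a ratio near 1 would mean that the two methods disagree with each other about as much as one method disagrees with itself on repeated readings, while a ratio near 0 would mean the between-method disagreement dominates.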