Lester Kirchner H, Lemke Jon H
Department of Pediatrics, Rainbow Babies and Children's Hospital, Case Western Reserve University, Cleveland, OH 44106-6003, USA.
Stat Med. 2002 Jun 30;21(12):1761-72. doi: 10.1002/sim.1138.
In many studies it is valuable to assess both intrarater and interrater agreement. Most measures of intrarater agreement do not adjust for unequal prevalence estimates between a given rater's separate rating occasions, and measures of interrater agreement typically ignore data from the second set of assessments when raters make duplicate assessments. When both kinds of agreement are assessed, interrater agreement is sometimes larger than at least one of the corresponding intrarater agreements, which would imply that a rater agrees less with him/herself than with another rater. For the situation of multiple raters making duplicate assessments on all subjects, the authors propose properties for an agreement measure based on the odds ratio for a dichotomous trait: (i) estimate a single prevalence across the two reading occasions for each rater; (ii) estimate pairwise interrater agreement from all available data; (iii) bound the pairwise interrater agreement above by the corresponding intrarater agreements. Odds ratios are estimated under these properties by maximizing the multinomial likelihood subject to constraints, using generalized log-linear models in combination with a generalization of the Lemke-Dykstra iterative-incremental algorithm. An example from a mammography examination reliability study is used to demonstrate the new method.
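To make the motivating problem concrete, the following minimal Python sketch computes unconstrained odds ratios from pairs of dichotomous ratings and checks whether an interrater odds ratio exceeds the smaller of the two intrarater odds ratios, which is the situation property (iii) is meant to rule out. It is not the authors' constrained maximum-likelihood procedure. The function names `odds_ratio` and `violates_bound`, the 0.5 continuity correction, and the rating vectors are all assumptions introduced purely for illustration.

```python
from typing import Sequence


def odds_ratio(a: Sequence[int], b: Sequence[int]) -> float:
    """Odds ratio of the 2x2 table cross-classifying two 0/1 rating vectors."""
    n11 = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n10 = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    n01 = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    n00 = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    # Assumption: add 0.5 to every cell so the ratio stays finite
    # when a cell is empty (not part of the source abstract).
    return ((n11 + 0.5) * (n00 + 0.5)) / ((n10 + 0.5) * (n01 + 0.5))


def violates_bound(inter: float, intra_a: float, intra_b: float) -> bool:
    """True if interrater agreement exceeds either intrarater agreement."""
    return inter > min(intra_a, intra_b)


if __name__ == "__main__":
    # Hypothetical ratings of 8 subjects: two raters, two occasions each.
    a1 = [1, 1, 1, 0, 0, 0, 1, 0]
    a2 = [1, 0, 1, 0, 1, 0, 1, 0]
    b1 = [1, 1, 1, 0, 0, 0, 1, 0]
    b2 = [1, 1, 1, 0, 0, 0, 1, 0]

    intra_a = odds_ratio(a1, a2)  # rater A against him/herself
    intra_b = odds_ratio(b1, b2)  # rater B against him/herself
    inter = odds_ratio(a1, b1)    # uses occasion 1 only, as is typical
    print(intra_a, intra_b, inter, violates_bound(inter, intra_a, intra_b))
```

In this made-up data set the occasion-1 interrater odds ratio is larger than rater A's intrarater odds ratio, so the unconstrained estimates conflict with property (iii). The constrained estimation described in the abstract is intended to prevent that outcome.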