Obuchowski N A, Applegate K E, Goske M J, Arheart K L, Myers M T, Morrison S
Department of Biostatistics and Epidemiology, The Cleveland Clinic Foundation, OH 44195, USA.
Acad Radiol. 2001 Oct;8(10):947-54. doi: 10.1016/S1076-6332(03)80638-4.
In practice, readers must often choose between multiple diagnoses. For assessing reader accuracy in these settings, Obuchowski et al have proposed the "differential diagnosis" method, which derives all pairwise estimates of accuracy for the various diagnoses, along with summary measures of accuracy. The current study assessed the correspondence between the differential diagnosis method and conventional binary-truth state experiments.
Two empirical studies were conducted at two institutions with different readers and diagnostic tests. Readers used the differential diagnosis format to interpret a set of cases. In subsequent readings they interpreted the cases in binary-truth state experiments. Spearman rank correlation coefficients and the percentages of agreement in scores were computed, and the areas under the receiver operating characteristic curves were estimated and compared.
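The three comparison statistics named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The function names, the 1-5 confidence scale, and the sample scores are hypothetical. The area under the curve is computed with the nonparametric (Mann-Whitney) estimator, which is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def percent_agreement(x, y):
    """Percentage of cases given exactly the same score in both readings."""
    return 100.0 * sum(a == b for a, b in zip(x, y)) / len(x)


def auc(scores, truth):
    """Nonparametric ROC area: P(score of diseased > score of non-diseased),
    counting ties as one half."""
    pos = [s for s, t in zip(scores, truth) if t == 1]
    neg = [s for s, t in zip(scores, truth) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical confidence scores (1-5) for one reader on eight cases.
    differential = [1, 2, 3, 4, 5, 2, 4, 5]  # differential diagnosis format
    binary = [1, 2, 4, 4, 5, 1, 3, 5]        # binary-truth state format
    truth = [0, 0, 0, 1, 1, 0, 1, 1]         # 1 = diagnosis present

    print("Spearman:", spearman(differential, binary))
    print("Agreement (%):", percent_agreement(differential, binary))
    print("AUC, differential format:", auc(differential, truth))
    print("AUC, binary format:", auc(binary, truth))
```

Applying the same functions to two readers' scores for the same format, rather than one reader's scores for two formats, would give between-reader counterparts of these statistics.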
The between-format Spearman rank correlation coefficients were 0.697-0.718 and 0.750-0.780 for the two studies; the between-reader correlations were 0.417 and 0.792, respectively. The percentages of agreement between formats for the two studies were 50.0%-51.7% and 72.9%-78.8%; the percentages of agreement between readers were 45.0% and 80%, respectively. In the first study there were several significant differences in the areas under receiver operating characteristic curves; in the second study these differences were small.
The differences observed between the two formats can be attributed to within-reader variability and inherent differences in the questions posed to readers in the multiple-diagnoses versus binary-truth state reading sessions. The differential diagnosis format is useful for estimating accuracy when there are multiple possible diagnoses.