Rockette H E, King J L, Thaete F L, Fuhrman C R, Slifko R M, Gur D
Department of Biostatistics, University of Pittsburgh, Pa., USA.
Acad Radiol. 1998 Feb;5(2):86-92. doi: 10.1016/s1076-6332(98)80127-x.
To assess the usefulness of classifying degree of difficulty in abnormality detection and to determine the effect of knowing the true diagnosis when selecting subtle images for observer-performance studies.
A total of 529 posteroanterior chest images that had been used in a multiabnormality, multireader observer-performance study were rated by three observers for the difficulty of determining the presence or absence of each abnormality, both when the true diagnosis was known and when it was not. Changes in image subtlety ratings were evaluated, and actual observer-performance results were compared across groups of images defined by the raters' classifications made with and without the true diagnosis available.
The majority of negative cases (9,168 of 12,258, 74.8%) were rated as "easy" to determine. The selection of cases for the "subtle" category changed substantially when the truth was known compared with when it was not provided. These changes led to differences in observer performance between typical and subtle cases. Combining ratings of case subtlety by agreement among multiple raters resulted in a well-ordered selection, with observer performance decreasing as a function of subtlety rating.
Cases for observer-performance studies that stress the diagnostic system can be successfully selected in the multiple-disease setting by experienced readers and should be selected with the truth known to the raters. The degree of agreement by multiple raters can be used to refine subtlety ratings.
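The idea of refining subtlety ratings by rater agreement can be illustrated with a minimal sketch. The abstract does not state how agreement was combined, so everything here is an assumption made for illustration: the three-level scale ("easy", "typical", "subtle"), the function names, the case identifiers, and the rule of ordering cases by how many raters called them "subtle".

```python
# Hypothetical scheme: each rater labels a case "easy", "typical", or "subtle".
# The combining rule below is an assumption, not the study's actual method.

def agreement_score(ratings, label="subtle"):
    """Count how many raters assigned `label` to one case."""
    return sum(1 for r in ratings if r == label)


def rank_by_agreement(cases, label="subtle"):
    """Group case IDs by the number of raters who chose `label`,
    ordered from highest agreement to lowest."""
    groups = {}
    for case_id, ratings in cases.items():
        groups.setdefault(agreement_score(ratings, label), []).append(case_id)
    return dict(sorted(groups.items(), reverse=True))


# Made-up ratings from three raters, for illustration only.
example = {
    "case_a": ["subtle", "subtle", "subtle"],
    "case_b": ["subtle", "typical", "subtle"],
    "case_c": ["easy", "typical", "easy"],
}
tiers = rank_by_agreement(example)
```

Under these assumptions, cases on which all raters agree form the most subtle tier, and observer performance could then be compared tier by tier.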