Wagner Robert F, Beam Craig A, Beiden Sergey V
Office of Science and Technology, Center for Devices & Radiological Health, Food and Drug Administration, Rockville, Maryland 20850, USA.
Med Decis Making. 2004 Nov-Dec;24(6):561-72. doi: 10.1177/0272989X04271043.
The multiple-reader, multiple-case (MRMC) approach to receiver operating characteristic (ROC) analysis is becoming the dominant assessment paradigm in medical imaging. Its most common version involves having many readers read every patient case in the study, a critical feature since differences among competing imaging modalities are often dominated by differences in reader performance. The present authors have carried out MRMC ROC analysis on a uniquely large data set for mammography. The analysis quantifies the great range of observed reader skill in that data set. It also demonstrates that the sample sizes are sufficiently large that the conclusions generalize to the populations sampled here with little uncertainty from the finite sample size. A schematic approach to bracketing the utility matrix is then used to study trends in the resulting expected utility functions that correspond to the range of observed ROC curves. This is done for both the screening and the diagnostic context. The results raise 2 hypotheses for further investigation. First, it is possible that the present ambiguity surrounding the effectiveness of mammography is due in part to the observed range of reader skills and corresponding expected utility functions. Second, it is possible that computer-assisted modalities for mammography may lead to improvements in the expected utility function not only for screening but also in the diagnostic context, especially for the lower performing readers.
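The link between a reader's ROC curve and an expected utility function can be illustrated with a minimal sketch. It is not the authors' method: it assumes an equal-variance binormal ROC curve, and the separation values `a`, the prevalences, and every entry of the utility matrix in `UTILITIES` are hypothetical placeholders chosen only to show the trend. Expected utility at an operating point is the prevalence-weighted sum of the four outcome utilities.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Hypothetical utility matrix (illustrative values only, not from the study).
UTILITIES = {"u_tp": 0.9, "u_fn": 0.0, "u_fp": 0.95, "u_tn": 1.0}


def binormal_tpf(fpf, a, b=1.0):
    """True-positive fraction at a given false-positive fraction on a
    binormal ROC curve: TPF = Phi(a + b * Phi^-1(FPF))."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpf))


def expected_utility(fpf, tpf, prevalence, u_tp, u_fn, u_fp, u_tn):
    """Prevalence-weighted expected utility at one ROC operating point."""
    diseased = tpf * u_tp + (1.0 - tpf) * u_fn
    healthy = fpf * u_fp + (1.0 - fpf) * u_tn
    return prevalence * diseased + (1.0 - prevalence) * healthy


def best_operating_point(a, prevalence, utilities, steps=999):
    """Scan FPF over an open grid in (0, 1) and return the
    (fpf, tpf, expected_utility) triple with the highest expected utility."""
    best = None
    for i in range(1, steps + 1):
        fpf = i / (steps + 1)
        tpf = binormal_tpf(fpf, a)
        eu = expected_utility(fpf, tpf, prevalence, **utilities)
        if best is None or eu > best[2]:
            best = (fpf, tpf, eu)
    return best


if __name__ == "__main__":
    # Hypothetical prevalences for the two contexts and two reader skill levels.
    for context, prevalence in (("screening", 0.005), ("diagnostic", 0.2)):
        for label, a in (("lower-skill reader", 1.0), ("higher-skill reader", 2.0)):
            fpf, tpf, eu = best_operating_point(a, prevalence, UTILITIES)
            print(f"{context:10s} {label:19s} FPF={fpf:.3f} TPF={tpf:.3f} EU={eu:.4f}")
```

Under these assumptions, a reader with a higher ROC curve reaches a higher maximum expected utility at any fixed prevalence and utility matrix. Bracketing the utility matrix would amount to repeating the scan for a low and a high choice of the utility values.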