Division of Imaging and Applied Mathematics, OSEL, CDRH, U.S. Food and Drug Administration, Silver Spring, Maryland 20993.
Med Phys. 2013 Nov;40(11):111903. doi: 10.1118/1.4823755.
Studies of lesion detectability are often carried out to evaluate medical imaging technology. For such studies, several approaches have been proposed to measure observer performance, such as the receiver operating characteristic (ROC), the localization ROC (LROC), the free-response ROC (FROC), the alternative free-response ROC (AFROC), and the exponentially transformed FROC (EFROC) paradigms. An experimenter seeking to carry out such a study is therefore confronted with an array of choices. Traditionally, arguments for different approaches have been made on the basis of practical considerations (statistical power, etc.) or the gross level of analysis (case level or lesion level). This article contends that a careful consideration of utility should form the rationale for matching the assessment paradigm to the clinical task of interest.
In utility theory, task performance is commonly evaluated with total expected utility, which integrates the various event utilities against the probability of each event. To formalize the relationship between expected utility and the summary curve associated with each assessment paradigm, the concept of a "natural" utility structure is proposed. A natural utility structure is defined for a summary curve when the variables associated with the summary curve axes are sufficient for computing total expected utility, assuming that the disease prevalence is known.
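As an illustration of the definition above, the following minimal sketch computes total expected utility at a single ROC operating point. It uses the standard four-outcome form for a binary decision, in which the ROC axes (true-positive fraction and false-positive fraction) together with a known prevalence are enough to evaluate the sum. The function name, parameter names, and all numerical values are hypothetical and chosen only for illustration; they are not taken from the article.

```python
def expected_utility_roc(tpf, fpf, prevalence, u_tp, u_fn, u_fp, u_tn):
    """Total expected utility of a binary decision at one ROC operating point.

    tpf, fpf   : true- and false-positive fractions (the ROC axes)
    prevalence : probability that a case is diseased (assumed known)
    u_*        : utilities of the four decision outcomes (hypothetical inputs)
    """
    for name, value in (("tpf", tpf), ("fpf", fpf), ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    # Diseased cases: true positive with probability tpf, false negative otherwise.
    diseased_term = prevalence * (u_tp * tpf + u_fn * (1.0 - tpf))
    # Non-diseased cases: false positive with probability fpf, true negative otherwise.
    healthy_term = (1.0 - prevalence) * (u_fp * fpf + u_tn * (1.0 - fpf))
    return diseased_term + healthy_term


# Illustrative (made-up) operating point and utilities.
eu = expected_utility_roc(
    tpf=0.8, fpf=0.1, prevalence=0.05,
    u_tp=1.0, u_fn=-1.0, u_fp=-0.2, u_tn=0.0,
)
print(f"expected utility: {eu:.3f}")
```

Because the result depends on the data only through the two ROC axis variables and the prevalence, this four-outcome utility assignment is an example of what the abstract calls a natural utility structure for the ROC curve.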
Natural utility structures for ROC, LROC, FROC, AFROC, and EFROC curves are introduced, clarifying how the utilities of correct and incorrect decisions are aggregated by summary curves. Further, conditions are given under which general utility structures for localization-based methodologies reduce to case-based assessment.
Overall, the findings reveal how summary curves correspond to natural utility structures of diagnostic tasks, suggesting utility as a motivating principle for choosing an assessment paradigm.