Wixted John T, Mickes Laura
1 Department of Psychology, University of California, San Diego, CA, USA.
2 Department of Psychology, Royal Holloway, University of London, London, UK.
Cogn Res Princ Implic. 2018;3(1):9. doi: 10.1186/s41235-018-0093-8. Epub 2018 Mar 14.
Receiver operating characteristic (ROC) analysis was introduced to the field of eyewitness identification 5 years ago. Since that time, it has been both influential and controversial, and the debate has raised an issue about measuring discriminability that is rarely considered. The issue concerns the distinction between empirical discriminability (measured by area under the ROC curve) vs. underlying/theoretical discriminability (measured by d′ or variants of it). Under most circumstances, the two measures will agree about a difference between two conditions in terms of discriminability. However, it is possible for them to disagree, and that fact can lead to confusion about which condition actually yields higher discriminability. For example, if the two conditions have implications for real-world practice (e.g., a comparison of competing lineup formats), should a policymaker rely on the area-under-the-curve measure or the theory-based measure? Here, we illustrate the fact that a given empirical ROC yields as many underlying discriminability measures as there are theories that one is willing to take seriously. No matter which theory is correct, for practical purposes, the singular area-under-the-curve measure best identifies the diagnostically superior procedure. For that reason, area under the ROC curve informs policy in a way that underlying theoretical discriminability never can. At the same time, theoretical measures of discriminability are equally important, but for a different reason. Without an adequate theoretical understanding of the relevant task, the field will be in no position to enhance empirical discriminability.
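The contrast between the two kinds of measure can be made concrete with a small sketch. The ROC points below are invented for illustration and are not data from the article. The code also makes two simplifying assumptions: it treats each point as a simple yes/no hit and false-alarm pair, which ignores the multi-member structure of a real lineup, and it uses the textbook equal-variance and unequal-variance Gaussian signal detection models as two example "theories". The function names and the zROC slope parameter `s` are hypothetical choices made for this sketch. The empirical measure returns one number for the whole curve, while the theoretical measure changes with the model one assumes.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the "z-transform" of a rate).
_z = NormalDist().inv_cdf


def partial_auc(points):
    """Trapezoidal area under the ROC points, from (0, 0) up to the
    largest false-ID rate observed. Makes no theoretical assumptions."""
    pts = [(0.0, 0.0)] + sorted(points)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def d_prime(hit, fa):
    """Equal-variance Gaussian model: d' = z(hit) - z(fa)."""
    return _z(hit) - _z(fa)


def d_a(hit, fa, s):
    """Unequal-variance Gaussian model with assumed zROC slope s:
    d_a = sqrt(2 / (1 + s^2)) * (z(hit) - s * z(fa))."""
    return (2 / (1 + s ** 2)) ** 0.5 * (_z(hit) - s * _z(fa))


# Hypothetical (false-ID rate, correct-ID rate) pairs for one procedure.
roc = [(0.02, 0.30), (0.05, 0.45), (0.10, 0.55)]

print("partial AUC:", round(partial_auc(roc), 5))
fa, hit = roc[-1]
print("d' (equal variance):", round(d_prime(hit, fa), 3))
print("d_a (assumed s = 0.8):", round(d_a(hit, fa, 0.8), 3))
```

With `s = 1` the two theoretical measures coincide; with any other assumed slope they diverge for the same data, whereas the partial area under the curve is unaffected by the choice of model.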