Muschelli John
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe St, Baltimore, MD 21205.
J Classif. 2020 Oct;37(3):696-708. doi: 10.1007/s00357-019-09345-1. Epub 2019 Dec 23.
In the analysis of binary outcomes, the receiver operator characteristic (ROC) curve is heavily used to show the performance of a model or algorithm. The ROC curve is informative about performance over a series of thresholds and can be summarized by the area under the curve (AUC), a single number. When a predictor is categorical, the ROC curve has one fewer potential thresholds than the number of categories; when the predictor is binary, there is only one threshold. As the AUC may be used in decision-making processes to determine the best model, it is important to discuss how it agrees with the intuition from the ROC curve. We discuss how the interpolation of the curve between thresholds with binary predictors can substantially change the AUC. Overall, we show that the AUC most commonly estimated by software corresponds to a linear interpolation of the ROC curve with binary predictors, which we believe can lead to misleading results. We compare R, Python, Stata, and SAS software implementations. We recommend reporting the interpolation used and discuss the merit of using the step function interpolator, also referred to as the "pessimistic" approach by Fawcett (2006).
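To make the effect of the interpolation choice concrete, the following is a minimal sketch and not code from the paper. It assumes a binary predictor, so the ROC curve has a single interior point (FPR, TPR) between (0, 0) and (1, 1). The function names `roc_point`, `auc_linear`, and `auc_step` and the toy data are hypothetical, chosen only for illustration. Under these assumptions the linear interpolation gives an area of (sensitivity + specificity) / 2, while the step-function ("pessimistic") interpolation gives sensitivity × specificity.

```python
def roc_point(y_true, x_binary):
    """Return (FPR, TPR) at the single threshold of a binary predictor."""
    tp = sum(1 for y, x in zip(y_true, x_binary) if y == 1 and x == 1)
    fp = sum(1 for y, x in zip(y_true, x_binary) if y == 0 and x == 1)
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = sum(1 for y in y_true if y == 0)
    return fp / n_neg, tp / n_pos


def auc_linear(fpr, tpr):
    """Area under straight lines (0,0) -> (fpr,tpr) -> (1,1).

    Algebraically equal to (tpr + (1 - fpr)) / 2.
    """
    left_triangle = 0.5 * fpr * tpr
    right_trapezoid = 0.5 * (1 - fpr) * (tpr + 1)
    return left_triangle + right_trapezoid


def auc_step(fpr, tpr):
    """Area under the step function: height 0 until fpr, then height tpr.

    Equal to tpr * (1 - fpr), i.e. sensitivity * specificity.
    """
    return (1 - fpr) * tpr


# Toy data (hypothetical): 4 negatives, 4 positives, one binary predictor.
y = [0, 0, 0, 0, 1, 1, 1, 1]
x = [0, 0, 0, 1, 0, 1, 1, 1]

fpr, tpr = roc_point(y, x)
print("FPR, TPR:", fpr, tpr)
print("AUC (linear interpolation):", auc_linear(fpr, tpr))
print("AUC (step interpolation):  ", auc_step(fpr, tpr))
```

With this toy data, sensitivity and specificity are both 0.75, so the linear interpolation reports 0.75 and the step interpolation 0.5625. The gap between the two numbers is why reporting the interpolation alongside the AUC matters.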