Bandos Andriy I, Guo Ben, Gur David
Department of Biostatistics, University of Pittsburgh, 7137 Parran Hall, 130 DeSoto Str., Pittsburgh, PA 15146.
Acad Radiol. 2017 Feb;24(2):209-219. doi: 10.1016/j.acra.2016.09.020. Epub 2016 Nov 21.
The "binormal" model is the most frequently used tool for parametric receiver operating characteristic (ROC) analysis. The binormal ROC curves can have "improper" (non-concave) shapes that are unrealistic in many practical applications, and several tools (eg, PROPROC) have been developed to address this problem. However, due to the general robustness of binormal ROCs, the improperness of the fitted curves might carry little consequence for inferences about global summary indices, such as the area under the ROC curve (AUC). In this work, we investigate the effect of severe improperness of fitted binormal ROC curves on the reliability of AUC estimates when the data arise from an actually proper curve.
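The abstract does not spell out the model, so the following is only a minimal sketch under the conventional two-parameter form, TPF = Φ(a + b·Φ⁻¹(FPF)) with AUC = Φ(a/√(1 + b²)). The function names are hypothetical, and the grid-based concavity check is an illustration of what "improper" means, not the authors' criterion for "severely improper".

```python
import numpy as np
from scipy.stats import norm

def binormal_tpf(fpf, a, b):
    """True-positive fraction on the binormal ROC curve at the given FPF."""
    return norm.cdf(a + b * norm.ppf(fpf))

def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return norm.cdf(a / np.sqrt(1.0 + b ** 2))

def chance_crossing_fpf(a, b):
    """FPF where the curve meets the chance diagonal (None when b == 1)."""
    if b == 1.0:
        return None
    # TPF = FPF when a + b*z = z, i.e. z = a / (1 - b).
    return norm.cdf(a / (1.0 - b))

def is_concave_on_grid(a, b, n=2001):
    """Check concavity numerically: chord slopes must never increase."""
    fpf = np.linspace(0.0, 1.0, n)
    tpf = binormal_tpf(fpf, a, b)
    slopes = np.diff(tpf) / np.diff(fpf)
    return bool(np.all(np.diff(slopes) <= 1e-9))

if __name__ == "__main__":
    for a, b in [(1.0, 1.0), (1.0, 0.5)]:
        print(a, b, binormal_auc(a, b), is_concave_on_grid(a, b),
              chance_crossing_fpf(a, b))
```

With b = 1 the curve is concave everywhere. With b ≠ 1 it crosses the chance diagonal once, producing the "hook" that makes the curve improper, while the AUC formula remains well defined.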
We designed theoretically proper ROC scenarios that induce a severely improper shape of the fitted binormal curves in the presence of well-distributed empirical ROC points. The binormal curves were fitted using a maximum likelihood approach. Using simulations, we estimated the frequency of severely improper fitted curves, the bias of the estimated AUC, and the coverage of 95% confidence intervals (CIs). In Appendix S1, we provide additional information on percentiles of the distribution of AUC estimates and on bias when estimating partial AUCs. We also compared the results to a reference standard provided by empirical estimates obtained from continuous data.
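A simulation of this kind might be sketched as follows. Everything specific here is an assumption for illustration, not the authors' design: the proper "truth" is a power-law curve TPF = FPF^θ (exponential scores, AUC = 1/(1 + θ)), ratings fall into five categories with cutoffs at FPF = 0.7, 0.5, 0.3, and 0.1, and the helper names are hypothetical. The sketch covers only the bias of the fitted AUC and the fitted b; CI coverage is omitted because it would additionally need standard errors from the information matrix.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import ndtr  # standard normal CDF

def unpack(params):
    """Map unconstrained parameters to (a, b, increasing cutoffs)."""
    a, b = params[0], np.exp(params[1])
    cuts = params[2] + np.concatenate(([0.0], np.cumsum(np.exp(params[3:]))))
    return a, b, cuts

def neg_log_lik(params, neg_counts, pos_counts):
    """Average negative multinomial log-likelihood of the ordinal counts."""
    a, b, cuts = unpack(params)
    edges = np.concatenate(([-np.inf], cuts, [np.inf]))
    p_neg = np.diff(ndtr(edges))          # negatives: latent N(0, 1)
    p_pos = np.diff(ndtr(b * edges - a))  # positives: latent N(a/b, 1/b^2)
    eps = 1e-12                           # guards against log(0)
    ll = neg_counts @ np.log(p_neg + eps) + pos_counts @ np.log(p_pos + eps)
    return -ll / (neg_counts.sum() + pos_counts.sum())

def fit_binormal(neg_counts, pos_counts):
    """Maximum likelihood binormal fit; returns (a, b, fitted AUC)."""
    k = len(neg_counts)
    start = np.linspace(-1.0, 1.5, k - 1)  # arbitrary starting cutoffs
    x0 = np.concatenate(([1.0, 0.0, start[0]], np.log(np.diff(start))))
    bounds = [(None, None), (-3.0, 3.0), (None, None)] + [(-5.0, 3.0)] * (k - 2)
    res = minimize(neg_log_lik, x0, args=(neg_counts, pos_counts),
                   method="L-BFGS-B", bounds=bounds)
    a, b, _ = unpack(res.x)
    return a, b, float(ndtr(a / np.sqrt(1.0 + b ** 2)))

def simulate(theta=0.25, n_neg=200, n_pos=200, n_rep=200, seed=0):
    """Monte Carlo bias of the fitted binormal AUC under a proper truth."""
    rng = np.random.default_rng(seed)
    thresholds = -np.log([0.7, 0.5, 0.3, 0.1])  # ascending rating cutoffs
    n_cat = len(thresholds) + 1
    true_auc = 1.0 / (1.0 + theta)
    aucs, bs = [], []
    for _ in range(n_rep):
        neg = rng.exponential(1.0, n_neg)          # FPF(c) = exp(-c)
        pos = rng.exponential(1.0 / theta, n_pos)  # TPF(c) = exp(-theta * c)
        neg_counts = np.bincount(np.searchsorted(thresholds, neg), minlength=n_cat)
        pos_counts = np.bincount(np.searchsorted(thresholds, pos), minlength=n_cat)
        _, b, auc = fit_binormal(neg_counts, pos_counts)
        aucs.append(auc)
        bs.append(b)
    return {"true_auc": true_auc, "bias": float(np.mean(aucs) - true_auc),
            "sd": float(np.std(aucs, ddof=1)), "mean_b": float(np.mean(bs))}

if __name__ == "__main__":
    print(simulate())
```

The log-parameterization keeps b positive and the cutoffs ordered without explicit constraints, so a general-purpose optimizer can be used. A fitted b below 1 signals an improper fitted curve even though the generating curve is concave.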
Depending on the scenario, up to 96% of the fitted curves were severely improper. The bias in the binormal AUC estimates was very small and the coverage of the CIs was close to nominal, whereas the estimates of partial AUC were biased upward in the high specificity range and downward in the low specificity range. Compared to a non-parametric approach, the binormal model led to slightly more variable AUC estimates, but at the same time to CIs with more appropriate coverage.
The improper shape of the fitted binormal curve, by itself, ie, in the presence of a sufficient number of well-distributed points, does not imply unreliable AUC-based inferences.