Bantis Leonidas E, Nakas Christos T, Reiser Benjamin
Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, 83200 Samos, Greece.
Biometrics. 2014 Mar;70(1):212-23. doi: 10.1111/biom.12107. Epub 2013 Nov 21.
After establishing the utility of a continuous diagnostic marker, investigators will typically address the question of determining a cut-off point to be used for diagnostic purposes in clinical decision making. The most commonly used optimality criterion for cut-off point selection in the context of ROC curve analysis is the maximum of the Youden index. The pair of sensitivity and specificity proportions that corresponds to the Youden index-based cut-off point characterizes the performance of the diagnostic marker. Confidence intervals for sensitivity and specificity are routinely estimated under the assumption that they are independent binomial proportions, since they arise from the independent populations of diseased and healthy subjects, respectively. However, the Youden index-based cut-off point is itself estimated from the data, so the resulting sensitivity and specificity proportions are in fact correlated. This correlation needs to be taken into account in order to calculate confidence intervals that attain the anticipated coverage. In this article we study parametric and non-parametric approaches for the construction of confidence intervals for the pair of sensitivity and specificity proportions that correspond to the Youden index-based optimal cut-off point. These approaches attain the anticipated coverage under different scenarios for the distributions of the healthy and diseased subjects. We find that a parametric approach based on a Box-Cox transformation to normality often works well. For biomarkers following more complex distributions, a non-parametric procedure using logspline density estimation can be used.
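To make the quantities in the abstract concrete, the following is a minimal sketch of the empirical Youden index-based cut-off and of a simple bootstrap that re-estimates the cut-off in every resample, so that the joint variability of sensitivity and specificity can be inspected. It is not the Box-Cox or logspline procedure studied in the article. The function names `youden_cutoff` and `bootstrap_sens_spec`, the convention that higher marker values indicate disease, and the simulated normal data are all assumptions made for illustration.

```python
import numpy as np


def youden_cutoff(healthy, diseased):
    """Empirical Youden-optimal cut-off (hypothetical helper).

    Assumes higher marker values indicate disease: a subject is
    classified as diseased when the marker exceeds the cut-off c.
    Returns (cutoff, sensitivity, specificity, youden_index).
    """
    healthy = np.sort(np.asarray(healthy, dtype=float))
    diseased = np.sort(np.asarray(diseased, dtype=float))
    # Every observed value is a candidate cut-off.
    candidates = np.unique(np.concatenate([healthy, diseased]))
    # Specificity: fraction of healthy subjects with marker <= c.
    spec = np.searchsorted(healthy, candidates, side="right") / healthy.size
    # Sensitivity: fraction of diseased subjects with marker > c.
    sens = 1.0 - np.searchsorted(diseased, candidates, side="right") / diseased.size
    j = sens + spec - 1.0  # Youden index J(c)
    best = int(np.argmax(j))  # first maximiser if there are ties
    return candidates[best], sens[best], spec[best], j[best]


def bootstrap_sens_spec(healthy, diseased, n_boot=500, seed=0):
    """Bootstrap (sensitivity, specificity) pairs at the re-estimated cut-off.

    Each resample re-estimates the cut-off, so the returned pairs reflect
    the dependence induced by estimating the cut-off from the data.
    """
    rng = np.random.default_rng(seed)
    healthy = np.asarray(healthy, dtype=float)
    diseased = np.asarray(diseased, dtype=float)
    pairs = np.empty((n_boot, 2))
    for b in range(n_boot):
        h = rng.choice(healthy, size=healthy.size, replace=True)
        d = rng.choice(diseased, size=diseased.size, replace=True)
        _, se, sp, _ = youden_cutoff(h, d)
        pairs[b] = (se, sp)
    return pairs


if __name__ == "__main__":
    # Simulated data (assumption): healthy ~ N(0, 1), diseased ~ N(1.5, 1).
    rng = np.random.default_rng(1)
    healthy = rng.normal(0.0, 1.0, size=100)
    diseased = rng.normal(1.5, 1.0, size=100)

    c, se, sp, j = youden_cutoff(healthy, diseased)
    print(f"cut-off={c:.3f} sensitivity={se:.3f} specificity={sp:.3f} J={j:.3f}")

    pairs = bootstrap_sens_spec(healthy, diseased)
    corr = np.corrcoef(pairs[:, 0], pairs[:, 1])[0, 1]
    print(f"bootstrap correlation between sensitivity and specificity: {corr:.3f}")
```

Treating the two columns of `pairs` as independent would ignore whatever correlation the bootstrap reveals, which is the issue the abstract raises for intervals built from independent binomial proportions.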