He Yaohua, Escobar Michael
Global Biometric Sciences, Bristol-Myers Squibb Company, Princeton, NJ, USA.
Stat Med. 2008 Nov 10;27(25):5291-308. doi: 10.1002/sim.3335.
The ROC50 index, the area under the lower portion of the receiver operating characteristic (ROC) curve up to the first 50 false positives, has recently become widely used in genomic research. However, statistical inference on the ROC50 index is rarely carried out, owing to a lack of convenient inference methods and software tools. In this paper, we review developments in statistical methods for partial areas under ROC curves and, using nonparametric methods, derive a simple and direct variance formula for the partial area that differs from existing methods in the literature. We verify our method through simulation studies and compare it with existing binormal approaches. Using trimmed U-statistics theory, we show that the partial area is asymptotically normally distributed. On the basis of this asymptotic normality, we give formulas for the confidence interval and the test statistic, and we report their application to a genomic study with a sample size of approximately 10,000.
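To make the index itself concrete, the following is a minimal sketch of how a normalized ROC-n value might be computed from ranked scores. It illustrates only the point estimate, not the variance formula, confidence interval, or test statistic derived in the paper. The function name `roc_n`, the normalization by n times the number of positives, the handling of ties by input order, and the fallback when fewer than n negatives exist are all assumptions made for illustration.

```python
def roc_n(scores, labels, n=50):
    """Normalized partial ROC area up to the first n false positives.

    Assumed definition (a common normalization, not taken from the paper):
    ROC_n = (1 / (n_used * T)) * sum of t_i over the first n_used false
    positives, where t_i is the number of true positives ranked above the
    i-th false positive, T is the total number of positives, and
    n_used = min(n, number of negatives).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    total_pos = sum(1 for y in labels if y)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank from highest to lowest score; ties keep input order (an assumption).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    tp_seen = 0  # true positives ranked above the current position
    fp_seen = 0  # false positives encountered so far
    area = 0     # running sum of t_i
    for _, is_positive in ranked:
        if is_positive:
            tp_seen += 1
        else:
            fp_seen += 1
            area += tp_seen
            if fp_seen == n:
                break

    if fp_seen == 0:
        raise ValueError("at least one negative label is required")
    # If fewer than n negatives exist, normalize by those actually seen.
    return area / (fp_seen * total_pos)


if __name__ == "__main__":
    example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    example_labels = [1, 0, 1, 0, 1]
    print(roc_n(example_scores, example_labels, n=2))
```

A value of 1 corresponds to every positive being ranked above the first n false positives. In a genomic setting n would typically be 50, with `labels` marking the known true relationships.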