Ruopp Marcus D, Perkins Neil J, Whitcomb Brian W, Schisterman Enrique F
Division of Epidemiology, Statistics and Prevention Research, National Institute of Child Health and Human Development, National Institutes of Health, DHHS, 6100 Executive Blvd, 7B03, Rockville Bethesda, MD, USA.
Biom J. 2008 Jun;50(3):419-30. doi: 10.1002/bimj.200710415.
The receiver operating characteristic (ROC) curve is used to evaluate a biomarker's ability to classify disease status. The Youden Index (J), the maximum potential effectiveness of a biomarker, is a common summary measure of the ROC curve. In biomarker development, levels may be unquantifiable below a limit of detection (LOD) and missing from the overall dataset. Disregarding these observations may negatively bias the ROC curve and thus J. Several correction methods have been suggested for mean estimation and testing; however, little has been written about the ROC curve or its summary measures. We adapt non-parametric (empirical) and semi-parametric (ROC-GLM [generalized linear model]) methods and propose parametric methods (maximum likelihood [ML]) to estimate J and the optimal cut-point (c*) for a biomarker affected by a LOD. We develop unbiased estimators of J and c* via ML for normally and gamma distributed biomarkers. Alpha level confidence intervals are proposed using delta and bootstrap methods for the ML, semi-parametric, and non-parametric approaches, respectively. Simulation studies are conducted over a range of distributional scenarios and sample sizes evaluating estimators' bias, root mean square error, and coverage probability; the average bias was less than one percent for ML and GLM methods across scenarios and decreased with increased sample size. An example using polychlorinated biphenyl levels to classify women with and without endometriosis illustrates the potential benefits of these methods. We address the limitations and usefulness of each method in order to give researchers guidance in constructing appropriate estimates of biomarkers' true discriminating capabilities.
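As a rough illustration of the quantities named above, the Youden Index can be written as J = max over c of {sensitivity(c) + specificity(c) - 1}, with c* the cut-point that attains the maximum. The sketch below computes an empirical version and shows one simple way to retain observations below the LOD. It is not the authors' estimator: the function name `empirical_youden`, the rule that a value above the cut-point is classified as diseased, the substitution of the LOD for values below it, and the toy data are all assumptions made for illustration. The ML and ROC-GLM approaches are not reproduced here.

```python
from bisect import bisect_right


def empirical_youden(controls, cases, lod=None):
    """Return (J, c_star), where J = max_c {sens(c) + spec(c) - 1}.

    Assumptions (illustrative, not taken from the paper):
    - higher biomarker values indicate disease, so a subject is
      classified as diseased when the value is strictly above c;
    - values below `lod` are replaced by `lod`, so they are kept in the
      sample and fall at or below every cut-point >= lod.
    """
    if lod is not None:
        controls = [max(x, lod) for x in controls]
        cases = [max(x, lod) for x in cases]
    controls = sorted(controls)
    cases = sorted(cases)

    best_j, best_c = 0.0, None
    # Only observed values need to be tried as candidate cut-points.
    for c in sorted(set(controls) | set(cases)):
        spec = bisect_right(controls, c) / len(controls)  # P(control <= c)
        sens = 1 - bisect_right(cases, c) / len(cases)    # P(case > c)
        j = sens + spec - 1
        if j > best_j:
            best_j, best_c = j, c
    return best_j, best_c


# Hypothetical toy data, with a hypothetical LOD of 2.5.
controls = [1, 2, 3, 6]
cases = [4, 5, 7, 8]
lod = 2.5

full = empirical_youden(controls, cases)
# Discarding values below the LOD removes two controls and no cases.
dropped = empirical_youden([x for x in controls if x >= lod],
                           [x for x in cases if x >= lod])
kept = empirical_youden(controls, cases, lod=lod)
print(full, dropped, kept)
```

In this toy sample, discarding the two controls below the LOD lowers the estimated specificity and therefore J, whereas keeping them at the LOD recovers the full-data estimate because the optimal cut-point lies above the LOD. That behavior is specific to these made-up numbers and should not be read as a general result.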