Kang Le, Xiong Chengjie, Tian Lili
Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, MD 20993, United States.
Division of Biostatistics, Washington University in St. Louis, St. Louis, MO 63110, United States.
Comput Stat Data Anal. 2013 Dec;68. doi: 10.1016/j.csda.2013.07.007.
With three ordinal diagnostic categories, the most commonly used measures of overall diagnostic accuracy are the volume under the ROC surface (VUS) and the partial volume under the ROC surface (PVUS), which extend the area under the ROC curve (AUC) and the partial area under the ROC curve (PAUC), respectively. Estimating the VUS and PVUS requires a gold standard (GS) test of the true disease status. However, a GS is often difficult, inappropriate, or impossible to obtain because of misclassification error, risk to the subjects, or ethical concerns, so in many medical research studies the true disease status remains unobservable. Under the normality assumption, a maximum likelihood (ML) based approach that uses the expectation-maximization (EM) algorithm for parameter estimation is proposed. Three methods for confidence interval estimation of the difference in paired VUSs and PVUSs without a GS, based on the concepts of generalized pivot and parametric/nonparametric bootstrap, are compared, and their coverage probabilities are studied numerically. The proposed approaches are then applied to a real data set of 118 subjects from a cohort study in early stage Alzheimer's disease (AD) at the Washington University Knight Alzheimer's Disease Research Center to compare the overall diagnostic accuracy for early stage AD between two different pairs of neuropsychological tests.
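As a rough illustration of the ideas named in the abstract, the sketch below fits a three-component normal mixture by EM to the scores of a single test with no gold-standard labels, then plugs the estimates into the VUS under normality, VUS = P(X1 < X2 < X3) for independent normal scores from the three ordered classes. It is a simplified, univariate stand-in and not the authors' procedure, which deals with paired tests and confidence intervals for the difference in VUSs and PVUSs. The function names, the tercile-based starting values, the fixed iteration count, and the simulated data are all assumptions made for this example.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def norm_pdf(x, mu, sd):
    """Normal density with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def vus_normal(mu, sd, n_grid=4001):
    """VUS = P(X1 < X2 < X3) for independent normals (trapezoid rule).

    Integrates P(X1 < x) * P(X3 > x) against the density of X2.
    """
    (m1, m2, m3), (s1, s2, s3) = mu, sd
    lo, hi = m2 - 10.0 * s2, m2 + 10.0 * s2
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * h
        f = (norm_cdf((x - m1) / s1)
             * (1.0 - norm_cdf((x - m3) / s3))
             * norm_pdf(x, m2, s2))
        total += f * (0.5 if i in (0, n_grid - 1) else 1.0)
    return total * h


def em_three_normals(data, n_iter=200):
    """EM for a 3-component univariate normal mixture (class labels unobserved).

    Starting values come from splitting the sorted data into terciles,
    an assumption of this sketch rather than anything from the paper.
    """
    xs = sorted(data)
    n = len(xs)
    k = n // 3
    groups = [xs[:k], xs[k:2 * k], xs[2 * k:]]
    mu = [sum(g) / len(g) for g in groups]
    sd = [max(1e-3, math.sqrt(sum((x - m) ** 2 for x in g) / len(g)))
          for g, m in zip(groups, mu)]
    pi = [1.0 / 3.0] * 3
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each subject.
        resp = []
        for x in xs:
            w = [pi[j] * norm_pdf(x, mu[j], sd[j]) for j in range(3)]
            tot = sum(w) or 1e-300
            resp.append([wj / tot for wj in w])
        # M-step: weighted updates of proportions, means, and SDs.
        for j in range(3):
            nj = max(sum(r[j] for r in resp), 1e-12)
            mu[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mu[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sd[j] = max(1e-3, math.sqrt(var))
            pi[j] = nj / n
    return pi, mu, sd


# Simulated scores for illustration only (not the Alzheimer's data).
random.seed(0)
scores = ([random.gauss(0.0, 1.0) for _ in range(40)]
          + [random.gauss(3.0, 1.0) for _ in range(40)]
          + [random.gauss(6.0, 1.0) for _ in range(40)])
pi_hat, mu_hat, sd_hat = em_three_normals(scores)
print("estimated VUS:", vus_normal(mu_hat, sd_hat))
```

A useful sanity check on `vus_normal` is that three identical distributions give a VUS of 1/6, the chance level for three classes, while well-separated classes give a value close to 1. Interval estimates such as the bootstrap intervals mentioned in the abstract would require repeating the fit on resampled data, which this sketch does not attempt.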