Wu Yougui
Department of Epidemiology and Biostatistics, College of Public Health, University of South Florida, Tampa, Florida, USA.
Stat Med. 2021 Feb 20;40(4):1059-1071. doi: 10.1002/sim.8819. Epub 2020 Nov 18.
Statistical methods are well developed for estimating the area under the receiver operating characteristic curve (AUC) based on a random sample where the gold standard is available for every subject in the sample, or a two-phase sample where the gold standard is ascertained only at the second phase for a subset of subjects sampled using fixed sampling probabilities. However, the methods based on a two-phase sample do not attempt to optimize the sampling probabilities to minimize the variance of the AUC estimator. In this paper, we consider the optimal two-phase sampling design for evaluating the performance of an ordinal test in classifying disease status. We derived the analytic variance formula for the AUC estimator and used it to obtain the optimal sampling probabilities. The efficiency of two-phase sampling under the optimal sampling probabilities (OA) is evaluated by a simulation study, which indicates that, compared with two-phase sampling under proportional allocation (PA), two-phase sampling under OA achieves a substantial variance reduction by oversampling subjects with low and high ordinal levels. Furthermore, in comparison with one-phase random sampling, two-phase sampling under OA or PA has a clear advantage in reducing the variance of the AUC estimator when the variance of diagnostic test results in the diseased population is small relative to its counterpart in the nondiseased population. Finally, we applied the optimal two-phase sampling design to a real-world example to evaluate the performance of a questionnaire score in screening for childhood asthma.
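To make the two-phase setting concrete, the following is a minimal sketch of an inverse-probability-weighted AUC estimate for an ordinal test. It is an illustration under stated assumptions, not the estimator or variance formula derived in the paper: the function name `ipw_auc`, its arguments, and the example sampling probabilities are all hypothetical. Each verified subject is weighted by the reciprocal of the phase-two sampling probability of their ordinal level, and ties between a diseased and a nondiseased subject count as one half.

```python
def ipw_auc(levels, diseased, sampling_prob):
    """Weighted AUC from phase-two (verified) subjects only.

    levels        -- ordinal test level of each verified subject
    diseased      -- gold-standard status (True/False) of each verified subject
    sampling_prob -- dict mapping each level to its phase-two sampling
                     probability (assumed known and supplied by the user)
    """
    w1 = {}  # weighted count of diseased subjects per level
    w0 = {}  # weighted count of nondiseased subjects per level
    for t, d in zip(levels, diseased):
        w = 1.0 / sampling_prob[t]  # inverse-probability weight
        target = w1 if d else w0
        target[t] = target.get(t, 0.0) + w

    total1 = sum(w1.values())
    total0 = sum(w0.values())
    if total1 == 0 or total0 == 0:
        raise ValueError("need at least one diseased and one nondiseased subject")

    concordant = 0.0
    lower0 = 0.0  # weighted nondiseased count at strictly lower levels
    for t in sorted(set(w1) | set(w0)):
        n1 = w1.get(t, 0.0)
        n0 = w0.get(t, 0.0)
        # diseased at level t outrank all nondiseased below, tie with those at t
        concordant += n1 * (lower0 + 0.5 * n0)
        lower0 += n0
    return concordant / (total1 * total0)


# Hypothetical example: extreme levels verified with probability 0.5,
# the middle level with probability 1.0.
example = ipw_auc([1, 2, 2, 3], [False, False, True, True],
                  {1: 0.5, 2: 1.0, 3: 0.5})
```

With all sampling probabilities equal, the weights cancel and the function reduces to the usual empirical AUC with ties counted as one half. Choosing the probabilities to minimize the variance of the estimate is the design problem the paper addresses, and the sketch does not attempt it.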