Huang Ying
Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA and Department of Biostatistics, University of Washington, Seattle, WA 98109, USA
Biostatistics. 2016 Jul;17(3):499-522. doi: 10.1093/biostatistics/kxw003. Epub 2016 Feb 16.
The two-phase sampling design, in which biomarkers are subsampled from a phase-one cohort sample representative of the target population, has become the gold standard in biomarker evaluation. Many two-phase case-control studies involve biased sampling of cases and/or controls in the second phase. For example, controls are often frequency-matched to cases with respect to other covariates. Ignoring this biased sampling can lead to biased inference about a biomarker's classification accuracy. For the problems of estimating and comparing the area under the receiver operating characteristic curve (AUC) for a binary disease outcome, neither the impact of biased sampling of cases and/or controls on inference nor strategies for efficiently accounting for the sampling scheme have been well studied. In this project, we investigate the inverse-probability-weighted method to adjust for biased sampling in estimating and comparing AUCs. Asymptotic properties of the estimator and its inference procedure are developed for both Bernoulli sampling and finite-population stratified sampling. In simulation studies, the weighted estimators provide valid inference for estimation and hypothesis testing, whereas the standard empirical estimators can yield invalid inference. We demonstrate the use of the analytical variance formula for optimizing sampling schemes in biomarker study design, and we apply the proposed AUC estimators to examples in HIV vaccine research and prostate cancer research.
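To make the idea of inverse-probability weighting for the AUC concrete, the following is a minimal sketch, not the paper's exact estimator or its variance procedure. It assumes each phase-two subject has a known second-phase sampling probability, and it weights every case-control pair by the inverse of the product of the two probabilities. The function name `ipw_auc` and all argument names are hypothetical, and ties are counted as one half by convention.

```python
def ipw_auc(case_markers, control_markers, case_probs, control_probs):
    """Illustrative inverse-probability-weighted empirical AUC.

    case_probs and control_probs are the (assumed known) phase-two
    sampling probabilities of each sampled case and control.
    """
    numerator = 0.0
    denominator = 0.0
    for x, px in zip(case_markers, case_probs):
        for y, py in zip(control_markers, control_probs):
            # Weight of the pair: inverse of the joint sampling probability.
            weight = 1.0 / (px * py)
            # Kernel: 1 if the case marker exceeds the control marker,
            # 0.5 for a tie, 0 otherwise.
            if x > y:
                kernel = 1.0
            elif x == y:
                kernel = 0.5
            else:
                kernel = 0.0
            numerator += weight * kernel
            denominator += weight
    return numerator / denominator


# Hypothetical data: the first control was sampled with probability 0.5,
# so it stands in for two controls in the phase-one cohort.
weighted = ipw_auc([3.0, 4.0], [1.0, 3.5], [1.0, 1.0], [0.5, 1.0])
unweighted = ipw_auc([3.0, 4.0], [1.0, 3.5], [1.0, 1.0], [1.0, 1.0])
print(weighted, unweighted)
```

When all sampling probabilities are equal, the weights cancel and the sketch reduces to the standard empirical AUC, which is the estimator the abstract reports can give invalid inference under biased sampling.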