Balzer Laura B, van der Laan Mark J, Petersen Maya L
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, U.S.A.
Division of Biostatistics, University of California, Berkeley, 94110-7358, CA, U.S.A.
Stat Med. 2016 Nov 10;35(25):4528-4545. doi: 10.1002/sim.7023. Epub 2016 Jul 19.
In randomized trials, adjustment for measured covariates during the analysis can reduce variance and increase power. To avoid misleading inference, the analysis plan must be pre-specified. However, it is often unclear a priori which baseline covariates (if any) should be adjusted for in the analysis. Consider, for example, the Sustainable East Africa Research in Community Health (SEARCH) trial for HIV prevention and treatment. There are 16 matched pairs of communities and many potential adjustment variables, including region, HIV prevalence, male circumcision coverage, and measures of community-level viral load. In this paper, we propose a rigorous procedure to data-adaptively select the adjustment set, which maximizes the efficiency of the analysis. Specifically, we use cross-validation to select from a pre-specified library the candidate targeted maximum likelihood estimator (TMLE) that minimizes the estimated variance. For further gains in precision, we also propose a collaborative procedure for estimating the known exposure mechanism. Our small sample simulations demonstrate the promise of the methodology to maximize study power, while maintaining nominal confidence interval coverage. We show how our procedure can be tailored to the scientific question (intervention effect for the study sample vs. for the target population) and study design (pair-matched or not). Copyright © 2016 John Wiley & Sons, Ltd.
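The selection step described in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. It assumes simple 1:1 randomization with known treatment probability 0.5 and no pair-matching. It also swaps the TMLE for a linear working model with a common slope, and uses leave-one-out cross-validation of the squared influence curve as the variance estimate. The names `fit_q`, `cv_variance`, `select_adjustment`, the covariates `W1` and `W2`, and the simulated data are all hypothetical.

```python
import random

G = 0.5  # known randomization probability (assumption: simple 1:1 randomization)


def fit_q(train, cov):
    """Fit a working outcome model Q(a, w): arm mean plus a common slope on `cov`.

    cov=None gives the unadjusted candidate (arm means only).
    """
    arms = {a: [r for r in train if r["A"] == a] for a in (0, 1)}
    ybar = {a: sum(r["Y"] for r in rs) / len(rs) for a, rs in arms.items()}
    if cov is None:
        return lambda a, r: ybar[a]
    wbar = {a: sum(r[cov] for r in rs) / len(rs) for a, rs in arms.items()}
    # Pooled within-arm least-squares slope (equivalent to OLS of Y on A and W).
    sxy = sum((r[cov] - wbar[r["A"]]) * (r["Y"] - ybar[r["A"]]) for r in train)
    sxx = sum((r[cov] - wbar[r["A"]]) ** 2 for r in train)
    beta = sxy / sxx if sxx > 0 else 0.0
    return lambda a, r: ybar[a] + beta * (r[cov] - wbar[a])


def cv_variance(data, cov):
    """Leave-one-out estimate of the influence-curve variance for one candidate."""
    squared = []
    for i, r in enumerate(data):
        train = data[:i] + data[i + 1:]
        q = fit_q(train, cov)
        # Effect estimate implied by the model fitted on the training units.
        psi = sum(q(1, t) - q(0, t) for t in train) / len(train)
        # Influence curve evaluated at the held-out unit.
        h = r["A"] / G - (1 - r["A"]) / (1 - G)
        d = h * (r["Y"] - q(r["A"], r)) + q(1, r) - q(0, r) - psi
        squared.append(d * d)
    return sum(squared) / len(squared)


def select_adjustment(data, candidates):
    """Return the candidate with the smallest cross-validated variance, plus all scores."""
    scores = {c: cv_variance(data, c) for c in candidates}
    return min(scores, key=scores.get), scores


# Hypothetical simulated trial: W1 predicts the outcome, W2 is pure noise.
random.seed(0)
data = []
for i in range(32):
    w1, w2 = random.gauss(0, 1), random.gauss(0, 1)
    a = i % 2  # alternating assignment stands in for randomization
    y = 0.5 * a + 2.0 * w1 + random.gauss(0, 0.5)
    data.append({"A": a, "W1": w1, "W2": w2, "Y": y})

# The library of candidates is fixed before looking at the outcomes.
best, scores = select_adjustment(data, [None, "W1", "W2"])
print(best, scores)
```

The design point carried over from the abstract is that the candidate library is pre-specified while the choice among candidates is data-adaptive, with cross-validation guarding against picking a covariate that only appears to reduce variance in-sample.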