van der Laan Mark J, Balzer Laura B, Petersen Maya L
Division of Biostatistics, University of California, Berkeley.
J Stat Res. 2012 Dec 1;46(2):113-156.
In many randomized and observational studies, the allocation of treatment among a sample of n independent and identically distributed units is a function of the covariates of all sampled units. As a result, the treatment labels among the units are possibly dependent, which complicates estimation and poses challenges for statistical inference. For example, cluster randomized trials frequently sample communities from some target population, construct matched pairs of communities from those included in the sample based on some metric of similarity in baseline community characteristics, and then randomly allocate a treatment and a control intervention within each matched pair. In this case, the observed data can neither be represented as the realization of n independent random variables, nor, contrary to current practice, as the realization of n/2 independent random variables (treating the matched pair as the independent sampling unit). In this paper we study estimation of the average causal effect of a treatment under experimental designs in which treatment allocation potentially depends on the pre-intervention covariates of all units included in the sample. We define efficient targeted minimum loss-based estimators for this general design, present a theorem that establishes the desired asymptotic normality of these estimators and allows for asymptotically valid statistical inference, and discuss implementation of these estimators. We further investigate the relative asymptotic efficiency of this design compared with a design in which unit-specific treatment assignment depends only on that unit's own covariates. Our findings have practical implications for the optimal design and analysis of pair-matched cluster randomized trials, as well as for observational studies in which treatment decisions may depend on characteristics of the entire sample.
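The pair-matched design described in the abstract can be illustrated with a small simulation. The sketch below is only a rough illustration under stated assumptions, not the targeted minimum loss-based estimator studied in the paper: it assumes a single scalar baseline covariate, forms pairs by sorting on that covariate and pairing neighbours, and uses a simple mean of within-pair outcome differences. All function names, the outcome model, and the parameter values are hypothetical. What it shows is that the pairing, and therefore each unit's treatment label, depends on the covariates of every sampled unit.

```python
import random


def pair_match_and_randomize(covariates, rng):
    """Pair units that are adjacent after sorting on a scalar covariate,
    then randomly assign treatment (1) to one unit of each pair and
    control (0) to the other. Assumes an even number of units."""
    n = len(covariates)
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even number of units")
    # The sort uses the covariates of ALL units, so the pairing (and hence
    # the treatment labels) is a function of the whole sample.
    order = sorted(range(n), key=lambda i: covariates[i])
    pairs = [(order[k], order[k + 1]) for k in range(0, n, 2)]
    treatment = [0] * n
    for i, j in pairs:
        a = rng.randint(0, 1)  # fair coin flip within the pair
        treatment[i], treatment[j] = a, 1 - a
    return pairs, treatment


def simulate_trial(n=20, effect=1.0, seed=0):
    """Simulate one pair-matched trial under a hypothetical outcome model
    Y = W + effect * A + noise and return the mean within-pair difference."""
    rng = random.Random(seed)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # baseline covariate
    pairs, a = pair_match_and_randomize(w, rng)
    y = [w[i] + effect * a[i] + rng.gauss(0.0, 0.1) for i in range(n)]
    # Treated-minus-control outcome difference within each pair.
    diffs = [y[i] - y[j] if a[i] == 1 else y[j] - y[i] for i, j in pairs]
    estimate = sum(diffs) / len(diffs)
    return pairs, a, estimate


if __name__ == "__main__":
    pairs, a, estimate = simulate_trial()
    print("pairs:", pairs)
    print("treatment labels:", a)
    print("mean within-pair difference:", estimate)
```

Because each pair receives exactly one treated unit, the treatment labels are dependent by construction. This is the feature that prevents treating the data as n independent observations, and the abstract argues that treating the n/2 pairs as independent is also not justified when the pairs are formed from the sampled covariates.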