Genentech Inc., South San Francisco, California, USA.
School of Public Health, Biostatistics, University of California Berkeley, Berkeley, California, USA.
Stat Med. 2023 Aug 30;42(19):3443-3466. doi: 10.1002/sim.9813. Epub 2023 Jun 12.
Across research disciplines, cluster randomized trials (CRTs) are commonly implemented to evaluate interventions delivered to groups of participants, such as communities and clinics. Despite advances in the design and analysis of CRTs, several challenges remain. First, there are many possible ways to specify the causal effect of interest (e.g., at the individual level or at the cluster level). Second, the theoretical and practical performance of common methods for CRT analysis remains poorly understood. Here, we present a general framework to formally define an array of causal effects in terms of summary measures of counterfactual outcomes. Next, we provide a comprehensive overview of CRT estimators, including the t-test, generalized estimating equations (GEE), augmented-GEE, and targeted maximum likelihood estimation (TMLE). Using finite sample simulations, we illustrate the practical performance of these estimators for different causal effects and when, as commonly occurs, there are limited numbers of clusters of different sizes. Finally, our application to data from the Preterm Birth Initiative (PTBi) study demonstrates the real-world impact of varying cluster sizes and of targeting effects at the cluster level or at the individual level. Specifically, the relative effect of the PTBi intervention was 0.81 at the cluster level, corresponding to a 19% reduction in outcome incidence, and was 0.66 at the individual level, corresponding to a 34% reduction in outcome risk. Given its flexibility to estimate a variety of user-specified effects and its ability to adaptively adjust for covariates for precision gains while maintaining Type-I error control, we conclude that TMLE is a promising tool for CRT analysis.
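To make the distinction between cluster-level and individual-level effects concrete, the following is a minimal sketch on made-up numbers. It is not the paper's framework or any of its estimators: it computes simple unadjusted sample ratios of observed outcomes, whereas the paper defines effects through counterfactual outcomes. The `Cluster` type, the function names, and the toy data are all hypothetical. The cluster-level quantity weights every cluster equally; the individual-level quantity weights every participant equally, so large clusters dominate.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    arm: int     # 1 = intervention, 0 = control (hypothetical coding)
    size: int    # number of participants in the cluster
    events: int  # number of participants with the outcome


def cluster_level_risk(clusters, arm):
    """Average of cluster-specific outcome proportions: each cluster counts once."""
    props = [c.events / c.size for c in clusters if c.arm == arm]
    return sum(props) / len(props)


def individual_level_risk(clusters, arm):
    """Pooled outcome proportion: each participant counts once."""
    events = sum(c.events for c in clusters if c.arm == arm)
    total = sum(c.size for c in clusters if c.arm == arm)
    return events / total


def relative_effect(clusters, risk_fn):
    """Ratio of the intervention-arm summary to the control-arm summary."""
    return risk_fn(clusters, 1) / risk_fn(clusters, 0)


# Toy data (invented for illustration): one small and one large cluster per arm.
toy = [
    Cluster(arm=1, size=100, events=10),
    Cluster(arm=1, size=1000, events=50),
    Cluster(arm=0, size=100, events=10),
    Cluster(arm=0, size=1000, events=100),
]

cluster_rr = relative_effect(toy, cluster_level_risk)
individual_rr = relative_effect(toy, individual_level_risk)
print(f"cluster-level ratio:    {cluster_rr:.3f}")
print(f"individual-level ratio: {individual_rr:.3f}")
```

In this toy example the intervention appears to work only in the large cluster, so the individual-level ratio is further from 1 than the cluster-level ratio. The two summaries answer different questions whenever cluster size is related to the outcome or to the size of the effect, which is the situation the abstract highlights with the PTBi results.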