Tong Guangyu, Nevins Pascale, Ryan Mary, Davis-Plourde Kendra, Ouyang Yongdong, Pereira Macedo Jules Antoine, Meng Can, Wang Xueqi, Caille Agnès, Li Fan, Taljaard Monica
Section of Cardiovascular Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA.
Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
Clin Trials. 2025 Feb;22(1):45-56. doi: 10.1177/17407745241276137. Epub 2024 Oct 8.
BACKGROUND/AIMS: Stepped-wedge cluster randomized trials tend to require fewer clusters than standard parallel-arm designs due to the switches between control and intervention conditions, but there are no recommendations for the minimum number of clusters. Trials randomizing an extremely small number of clusters are not uncommon, but the justification for small numbers of clusters is often unclear and appropriate analysis is often lacking. In addition, stepped-wedge cluster randomized trials are methodologically more complex due to their longitudinal correlation structure, and ignoring the distinct within- and between-period intracluster correlations can underestimate the sample size in small stepped-wedge cluster randomized trials. We conducted a review of published small stepped-wedge cluster randomized trials to understand how and why they are used, and to characterize approaches used in their design and analysis.
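The point about distinct within- and between-period intracluster correlations can be illustrated with a small calculation. The sketch below is not taken from any of the reviewed trials. It assumes a cross-sectional stepped-wedge design, a continuous outcome analyzed with a linear mixed model with fixed period effects, equal cluster-period sizes, and a nested exchangeable correlation structure. The function names and all numeric inputs (4 clusters, 5 periods, 50 participants per cluster-period, effect size 0.3) are hypothetical choices for illustration only.

```python
from statistics import NormalDist

def sw_treatment_variance(steps, n_periods, m, rho_w, rho_b, sigma2=1.0):
    """Variance of the treatment effect estimator (GLS on cluster-period means).

    steps[i] : first period (0-based) in which cluster i receives the intervention
    m        : participants per cluster-period (assumed equal)
    rho_w    : within-period intracluster correlation
    rho_b    : between-period intracluster correlation (rho_b <= rho_w)
    """
    n_clusters = len(steps)
    # Treatment indicators and their period-wise averages across clusters.
    x = [[1.0 if j >= s else 0.0 for j in range(n_periods)] for s in steps]
    xbar = [sum(row[j] for row in x) / n_clusters for j in range(n_periods)]
    # Covariance of cluster-period means within a cluster is a*I + b*J.
    a = sigma2 * ((1 + (m - 1) * rho_w) / m - rho_b)
    b = sigma2 * rho_b
    c = b / (a + n_periods * b)  # (a*I + b*J)^-1 = (I - c*J) / a
    info = 0.0
    for row in x:
        # Centering by period removes the fixed period effects.
        d = [row[j] - xbar[j] for j in range(n_periods)]
        info += (sum(v * v for v in d) - c * sum(d) ** 2) / a
    return 1.0 / info

def approx_power(delta, variance, alpha=0.05):
    """Normal-approximation power; likely optimistic with very few clusters."""
    z = NormalDist()
    return z.cdf(abs(delta) / variance ** 0.5 - z.inv_cdf(1 - alpha / 2))

# Hypothetical design: 4 clusters, 5 periods, one cluster crossing over per step.
steps = [1, 2, 3, 4]
var_single = sw_treatment_variance(steps, 5, m=50, rho_w=0.05, rho_b=0.05)
var_distinct = sw_treatment_variance(steps, 5, m=50, rho_w=0.05, rho_b=0.025)
power_single = approx_power(0.3, var_single)
power_distinct = approx_power(0.3, var_distinct)
print(f"single ICC:    variance={var_single:.4f}, power={power_single:.2f}")
print(f"distinct ICCs: variance={var_distinct:.4f}, power={power_distinct:.2f}")
```

Under these assumed inputs, setting the between-period correlation equal to the within-period correlation yields a smaller variance, and therefore a higher apparent power, than allowing the between-period correlation to be lower. The normal approximation used here does not account for the small number of clusters, so both power figures should be read as upper bounds on what a small-sample-corrected analysis would achieve.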
METHODS: Electronic searches were used to identify primary reports of full-scale stepped-wedge cluster randomized trials published during the period 2016-2022; the subset that randomized two to six clusters was identified. Two reviewers independently extracted information from each report and any available protocol. Disagreements were resolved through discussion.
RESULTS: We identified 61 stepped-wedge cluster randomized trials that randomized two to six clusters: median sample size (Q1-Q3) 1426 (420-7553) participants. Twelve (19.7%) gave some indication that the evaluation was considered a "preliminary" evaluation and 16 (26.2%) recognized the small number of clusters as a limitation. Sixteen (26.2%) provided an explanation for the limited number of clusters: the need to minimize contamination (e.g. by merging adjacent units), limited availability of clusters, and logistical considerations were common explanations. The majority (51, 83.6%) presented sample size or power calculations, but only one assumed distinct within- and between-period intracluster correlations. Few (10, 16.4%) utilized restricted randomization methods; more than half (34, 55.7%) identified baseline imbalances. The most common statistical method for analysis was the generalized linear mixed model (44, 72.1%). Only four trials (6.6%) reported statistical analyses that accounted for the small number of clusters: one used generalized estimating equations with a small-sample correction, two used generalized linear mixed models with a small-sample correction, and one used Bayesian analysis. Another eight (13.1%) used fixed-effects regression, the performance of which requires further evaluation in stepped-wedge cluster randomized trials with small numbers of clusters. None used permutation tests or cluster-period level analysis.
CONCLUSION: Methods appropriate for the design and analysis of small stepped-wedge cluster randomized trials have not been widely adopted in practice. Greater awareness is required that the use of standard sample size calculation methods can provide spuriously low numbers of required clusters. Methods such as generalized estimating equations or generalized linear mixed models with small-sample corrections, Bayesian approaches, and permutation tests may be more appropriate for the analysis of small stepped-wedge cluster randomized trials. Future research is needed to establish best practices for stepped-wedge cluster randomized trials with a small number of clusters.
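As a rough illustration of one of the analysis options named above, the following is a minimal sketch of an exact permutation test on cluster-period means. It is one simple variant among several possible ones, not a method reported by the reviewed trials. The test statistic (an unweighted average of within-period differences between intervention and control cluster means), the function names, and the outcome values are all assumptions made for illustration; the data are invented.

```python
from itertools import permutations

def within_period_statistic(means, steps):
    """Average over periods of (mean of treated clusters - mean of control clusters).

    means[i][j] : outcome mean for cluster i in period j
    steps[i]    : first period (0-based) in which cluster i is under intervention
    Periods in which all clusters share the same condition are skipped.
    """
    n_periods = len(means[0])
    diffs = []
    for j in range(n_periods):
        treated = [row[j] for row, s in zip(means, steps) if j >= s]
        control = [row[j] for row, s in zip(means, steps) if j < s]
        if treated and control:
            diffs.append(sum(treated) / len(treated) - sum(control) / len(control))
    return sum(diffs) / len(diffs)

def permutation_p_value(means, steps):
    """Two-sided exact p-value: re-assign crossover steps to clusters in every order."""
    observed = abs(within_period_statistic(means, steps))
    n_extreme = n_total = 0
    for shuffled in permutations(steps):
        n_total += 1
        # Small tolerance so the observed allocation always counts as extreme.
        if abs(within_period_statistic(means, shuffled)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total, n_total

# Invented cluster-period means: 4 clusters (rows) by 5 periods (columns).
means = [
    [10.1, 12.2, 11.9, 12.3, 12.0],
    [12.0, 11.8, 14.1, 13.9, 14.2],
    [11.2, 10.9, 11.1, 13.0, 13.1],
    [13.1, 12.9, 13.2, 12.8, 15.0],
]
steps = [1, 2, 3, 4]  # cluster i crosses over at period steps[i]
p_value, n_perm = permutation_p_value(means, steps)
print(f"p = {p_value:.3f} from {n_perm} possible allocations")
```

With four clusters each assigned to a distinct sequence there are only 4! = 24 possible allocations, so the p-value can never fall below 1/24 (about 0.042). This shows why the number of clusters constrains what a permutation test can demonstrate in very small trials.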