Department of Child Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht and Utrecht University, P.O. Box 85090, Utrecht 3508 AB, The Netherlands.
Department of Child Neurology, Brain Center Rudolf Magnus, University Medical Center Utrecht and Utrecht University, P.O. Box 85090, Utrecht 3508 AB, The Netherlands; Biomedical MR Imaging and Spectroscopy group, Center for Image Sciences, University Medical Center Utrecht and Utrecht University, Heidelberglaan 100, Utrecht 3584 CX, The Netherlands.
J Clin Epidemiol. 2018 Oct;102:123-128. doi: 10.1016/j.jclinepi.2018.06.014. Epub 2018 Jul 5.
To study the statistical power of randomized clinical trials and examine developments over time.
We analyzed the statistical power of 136,212 clinical trials from 1975 to 2014 that were extracted from meta-analyses in the Cochrane Database of Systematic Reviews. For each trial, we determined the power to detect a standardized effect size, taking the meta-analyzed effect size as the effect to be detected. Average power, effect size, and temporal patterns were examined for all meta-analyses and for the subset of significant meta-analyses.
The proportion of trials with power ≥80% was low (7%) but increased over time, from 5% in 1975-1979 to 9% in 2010-2014. In significant meta-analyses, the proportion of trials with sufficient power increased from 9% to 15% over the same period (median power increased from 16% to 23%). This increase was mainly due to larger sample sizes, while effect sizes remained stable, with a median Cohen's h of 0.09 (interquartile range 0.04-0.22) and a median Cohen's d of 0.20 (interquartile range 0.11-0.40).
This study demonstrates that insufficient power in clinical trials remains a problem, although the situation is slowly improving. Our data encourage further efforts to increase statistical power in clinical trials to guarantee rigorous and reproducible evidence-based medicine.
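To make the notion of power for a standardized effect size concrete, the following is a minimal sketch in Python. It is not the authors' method: it assumes a two-sided test at alpha = 0.05, two equally sized groups, and a normal approximation, and the function names (`cohens_h`, `two_group_power`, `n_per_group_for_power`) are hypothetical names chosen for illustration.

```python
from math import asin, ceil, sqrt
from statistics import NormalDist

# Standard normal distribution, used for the normal approximation (an assumption).
_STD_NORMAL = NormalDist()


def cohens_h(p1: float, p2: float) -> float:
    """Cohen's h: difference between two arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))


def two_group_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for two equal groups.

    effect_size is a standardized effect (Cohen's d or Cohen's h).
    Assumes equal group sizes and a normal approximation to the test statistic.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size: float, target_power: float = 0.80,
                          alpha: float = 0.05) -> int:
    """Approximate per-group sample size needed to reach target_power.

    Uses the usual closed form, which ignores the negligible opposite-tail term.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(target_power)
    return ceil(2 * ((z_crit + z_power) / abs(effect_size)) ** 2)


if __name__ == "__main__":
    # Median effect sizes reported in the abstract.
    for label, effect in (("Cohen's d", 0.20), ("Cohen's h", 0.09)):
        n = n_per_group_for_power(effect)
        print(f"{label} = {effect}: about {n} participants per group for 80% power")
    # Power of a hypothetical trial with 50 participants per group at d = 0.20.
    print(f"power at n = 50 per group, d = 0.20: {two_group_power(0.20, 50):.2f}")
```

Under these assumptions, an effect as small as the reported medians needs several hundred to a few thousand participants per group to reach 80% power, which illustrates why sample size is the main lever for improving power when effect sizes stay stable.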