Weaver Chris S, Leonardi-Bee Jo, Bath-Hextall Fiona J, Bath Philip M W
Institute of Neuroscience, University of Nottingham, UK.
Stroke. 2004 May;35(5):1216-24. doi: 10.1161/01.STR.0000125010.70652.93. Epub 2004 Mar 18.
Only a few randomized controlled trials in acute stroke have shown a treatment-related benefit. Inadequate trial design, especially low sample size, may partly explain this failure. We investigated sample size calculations (SSCs) in a systematic review of acute stroke trials.
Full reports of nonconfounded randomized controlled trials that recruited patients within 1 week of stroke onset and were published before the end of 2001 were identified from the Cochrane Library and other bibliographic databases. Information on the SSC and outcome event rates was collected for each trial.
Of 189 identified trial reports, 57 (30%) reported ≥1 component of the SSC (phase II 14/129 [11%] versus phase III 43/60 [72%]; P<0.001), and 32 of these (56%) gave all the required parameters. Significance (alpha) was mentioned in 54 (96%) reports; 53 used a significance level of alpha=0.05. Power (1-beta) was given in 55 (98%) reports (median [25th and 75th percentile] 0.80 [0.80, 0.90]). The anticipated percentage of control subjects having a primary outcome event was given in 24 (42%) articles: case fatality 21.8% (11.8%, 23.5%, n=4) and combined death or disability/dependency 55.5% (44.5%, 66.3%, n=20); 25 studies used other outcomes and 8 studies gave insufficient information. Four of the 22 trials achieved a control rate within 5% of their prediction. The anticipated treatment effect was given in 49 (86%) reports; case fatality: anticipated 9.5% (1.1%, 12.5%, n=6), achieved -0.3% (-4.1%, +2.4%); combined death or disability/dependency: anticipated 13.0% (10.0%, 16.0%, n=25), achieved 1.8% (-0.5%, +5.4%). The median calculated sample size was 600 (198, 995, n=54).
Too few trial publications report the assumptions underlying their SSC. Most trials were underpowered, ie, power <0.90, used inappropriate assumptions for event rates, and were grossly overoptimistic in their expectation of treatment effect. These deficiencies will together have resulted in trials being far too small and reduced their chance of being able to detect real treatment effects.
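To illustrate how the four SSC components discussed above (alpha, power, control event rate, and treatment effect) interact, here is a minimal sketch using the textbook normal-approximation formula for comparing two proportions. It is not the method used by the review or by any of the included trials. The function name `n_per_group` is hypothetical, and the example assumes that the reported treatment effects are absolute differences in event rate, which the abstract does not state explicitly.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Patients needed per arm for a two-sided comparison of two
    proportions (normal approximation, unpooled variance).

    Illustrative helper only; not taken from the reviewed trials.
    """
    if p_control == p_treated:
        raise ValueError("event rates must differ")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = z.inv_cdf(power)           # power = 1 - beta
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


# Median figures for combined death or disability/dependency from the abstract.
# Assumption: the treatment effect is an absolute reduction in event rate.
control = 0.555
anticipated = n_per_group(control, control - 0.130)  # anticipated effect 13.0%
achieved = n_per_group(control, control - 0.018)     # achieved effect 1.8%

print(f"per arm, anticipated effect: {anticipated}")
print(f"per arm, achieved effect:    {achieved}")
```

Under these assumptions, the anticipated 13.0% effect calls for a few hundred patients per arm, whereas the achieved 1.8% effect would call for roughly 12,000 per arm. Required sample size scales with the inverse square of the effect, which is why overoptimistic effect estimates leave trials far too small.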