Department of Critical Care, Jordan Valley Medical Center, 3580 West 9000 South, West Jordan, Utah 84088, USA.
Crit Care. 2010;14(2):R77. doi: 10.1186/cc8990. Epub 2010 Apr 29.
Mortality is the most widely accepted outcome measure in randomized controlled trials of therapies for critically ill adults, but most of these trials fail to show a statistically significant mortality benefit. The reasons for this are unknown.
We searched five high-impact journals (Annals of Internal Medicine, British Medical Journal, JAMA, The Lancet, New England Journal of Medicine) for randomized controlled trials, published over a ten-year period, that compared mortality between therapies for critically ill adults. We abstracted data on the statistical design and results of these trials and compared the predicted delta (the effect size of the therapy relative to control, expressed as an absolute mortality reduction) with the observed delta, to determine whether a systematic overestimation of the predicted delta might explain the high prevalence of negative results in these trials.
We found 38 trials meeting our inclusion criteria. Only 5/38 (13.2%) of the trials provided justification for the predicted delta. The mean predicted delta among the 38 trials was 10.1% and the mean observed delta was 1.4% (P < 0.0001), resulting in a delta-gap of 8.7%. In only 2/38 (5.3%) of the trials did the observed delta exceed the predicted delta, and only 7/38 (18.4%) of the trials demonstrated statistically significant results in the hypothesized direction; these trials had smaller delta-gaps than the remainder of the trials (delta-gap 0.9% versus 10.5%; P < 0.0001). For trials showing non-significant trends toward benefit greater than 3%, large increases in sample size (380% to 1100%) would be required if a follow-up trial used the observed delta from the index trial as its predicted delta.
Investigators of therapies for critical illness systematically overestimate treatment effect size (delta) during the design of randomized controlled trials. This bias, which we refer to as "delta inflation", is a potential reason that these trials have a high rate of negative results. "Absence of evidence is not evidence of absence."
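The link between the predicted delta and the required sample size can be illustrated with a short calculation. The sketch below is not the method used in the study; it assumes the standard normal-approximation formula for comparing two proportions, a two-sided alpha of 0.05, 80% power, and a hypothetical control-arm mortality of 40%. The function name `n_per_arm` and all of these parameter values are illustrative assumptions. Only the two deltas (10.1% and 1.4%) are taken from the results above.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(control_mortality, delta, alpha=0.05, power=0.80):
    """Approximate patients needed per arm to detect an absolute
    mortality reduction `delta` (normal approximation, two-sided test)."""
    if not 0 < delta < control_mortality < 1:
        raise ValueError("need 0 < delta < control_mortality < 1")
    p1 = control_mortality           # mortality in the control arm
    p2 = control_mortality - delta   # mortality in the treatment arm
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)


if __name__ == "__main__":
    control = 0.40  # hypothetical baseline mortality, not from the study
    for label, delta in [("mean predicted delta", 0.101),
                         ("mean observed delta", 0.014)]:
        n = n_per_arm(control, delta)
        print(f"{label} ({delta:.1%}): about {n} patients per arm")
```

Because the required sample size scales roughly with 1/delta², halving the delta roughly quadruples the number of patients needed, which is consistent with the large sample-size increases reported for follow-up trials.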