The Division of General Internal Medicine, University of Alberta, Edmonton, Canada.
Int J Cardiol. 2013 Sep 30;168(2):1102-7. doi: 10.1016/j.ijcard.2012.11.048. Epub 2012 Dec 4.
Cochrane reviews are viewed as the gold standard in meta-analyses given their efforts to identify and limit systematic error that could cause spurious conclusions. The potential for random error to cause spurious conclusions in meta-analyses is less well appreciated.
We examined all reviews approved and published by the Cochrane Heart Group in the 2012 Cochrane Library that included at least one meta-analysis with 5 or more randomized trials. We used trial sequential analysis to classify statistically significant meta-analyses as true positives if their pooled sample size and/or their cumulative Z-curve crossed the O'Brien-Fleming monitoring boundaries for detecting a relative risk reduction (RRR) of at least 25%. We classified meta-analyses that did not achieve statistical significance as true negatives if their pooled sample size was sufficient to reject an RRR of 25%.
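The sample-size side of this classification can be illustrated with a minimal sketch. It assumes a two-sided alpha of 5%, 80% power, a hypothetical control-group event rate, and no adjustment for heterogeneity; none of these values is stated in the abstract. The function names are illustrative, and the Lan-DeMets alpha-spending function is shown as one common way to obtain O'Brien-Fleming-type boundaries, not necessarily the exact procedure the authors used.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def required_information_size(control_event_rate, rrr=0.25, alpha=0.05, power=0.80):
    """Total participants (both arms) needed to detect the given RRR.

    Assumed defaults: two-sided alpha of 0.05 and power of 0.80.
    No heterogeneity adjustment is applied in this sketch.
    """
    p_c = control_event_rate
    p_e = p_c * (1.0 - rrr)          # event rate under treatment
    p_bar = (p_c + p_e) / 2.0        # average event rate across arms
    delta = p_c - p_e                # absolute risk difference
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return 4.0 * (z_alpha + z_beta) ** 2 * p_bar * (1.0 - p_bar) / delta ** 2


def obf_alpha_spent(information_fraction, alpha=0.05):
    """Cumulative alpha spent under the Lan-DeMets O'Brien-Fleming-type function."""
    if not 0.0 < information_fraction <= 1.0:
        raise ValueError("information_fraction must be in (0, 1]")
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * (1.0 - _STD_NORMAL.cdf(z / information_fraction ** 0.5))


def has_sufficient_information(pooled_sample_size, control_event_rate, rrr=0.25):
    """True if the pooled sample size reaches the required information size."""
    return pooled_sample_size >= required_information_size(control_event_rate, rrr)
```

With a hypothetical control event rate of 10%, the required information size under these assumptions is on the order of 4,000 participants, so a meta-analysis pooling 1,500 participants would count as underpowered for a 25% RRR. The spending function releases very little alpha at small information fractions, which is why early significant results seldom cross an O'Brien-Fleming-type boundary.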
Twenty-three (41%) of the 56 meta-analyses reported statistically significant results, and 19 (83%) of these were true positives. Of the 33 statistically non-significant meta-analyses, 12 (36%) were true negatives. Overall, 25 (45%) of the 56 published Cochrane reviews were too small to detect or rule out an effect size of at least 25%; 12 of these were acknowledged as such by their authors. Of the 22 meta-analyses that were reported to be conclusive by their authors, 12 (55%) contained insufficient data to detect or rule out a 25% relative treatment effect.
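The reported counts are internally consistent, as a short arithmetic check shows. The variable names are illustrative, and labelling the unconfirmed results as "potential" false positives and false negatives is an interpretation of the abstract rather than its wording.

```python
total = 56
significant = 23
true_positives = 19
non_significant = total - significant      # 33
true_negatives = 12

# Results that trial sequential analysis could not confirm
potential_false_positives = significant - true_positives        # 4
potential_false_negatives = non_significant - true_negatives    # 21
inconclusive = potential_false_positives + potential_false_negatives  # 25


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


summary = {
    "significant": pct(significant, total),
    "true_positives": pct(true_positives, significant),
    "true_negatives": pct(true_negatives, non_significant),
    "inconclusive": pct(inconclusive, total),
    "conclusive_but_underpowered": pct(12, 22),
}
```

The 25 underpowered meta-analyses are therefore the 4 significant results that did not cross the monitoring boundaries plus the 21 non-significant results with too little data to reject a 25% RRR.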
False positive and false negative meta-analyses are common but infrequently recognized, even among methodologically robust reviews published by the Cochrane Heart Group. Meta-analysts and readers should incorporate trial sequential analysis when interpreting results.