Department of Clinical Epidemiology, Leiden University Medical Centre, RC Leiden, The Netherlands.
Int J Epidemiol. 2010 Dec;39(6):1567-81. doi: 10.1093/ije/dyq136. Epub 2010 Sep 13.
There is concern that non-inferiority trials might be deliberately designed to conceal that a new treatment is less effective than a standard treatment. To test this hypothesis, we performed a meta-analysis of non-inferiority trials to assess the average effect of experimental treatments compared with standard treatments.
One hundred and seventy non-inferiority treatment trials published in 121 core clinical journals were included. The trials were identified through a search of PubMed (1991 to 20 February 2009). Combined relative risk (RR) from meta-analysis comparing experimental with standard treatments was the main outcome measure.
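As a rough illustration of how a combined RR can be obtained, the sketch below pools per-trial log relative risks with inverse-variance weights (fixed-effects) and with a DerSimonian-Laird between-trial variance (random-effects). The abstract does not state which estimators were used, so both are assumptions; the function names and the trial counts in the usage note are hypothetical and are not data from the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_rr(events_exp, n_exp, events_std, n_std):
    """Log relative risk (experimental vs standard) and its approximate variance.

    Assumes both arms have at least one event; no continuity correction is applied.
    """
    rr = (events_exp / n_exp) / (events_std / n_std)
    var = 1 / events_exp - 1 / n_exp + 1 / events_std - 1 / n_std
    return math.log(rr), var


def dl_tau2(effects, variances):
    """DerSimonian-Laird estimate of between-trial variance (needs >= 2 trials)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    return max(0.0, (q - (len(effects) - 1)) / c)


def combined_rr(trials, random_effects=False):
    """Return (pooled RR, lower 95% limit, upper 95% limit).

    `trials` is a list of (events_exp, n_exp, events_std, n_std) tuples.
    """
    effects, variances = zip(*(log_rr(*t) for t in trials))
    tau2 = dl_tau2(effects, variances) if random_effects else 0.0
    w = [1 / (v + tau2) for v in variances]
    est = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    se = math.sqrt(1 / sum(w))
    return tuple(math.exp(x) for x in (est, est - Z_95 * se, est + Z_95 * se))
```

For example, `combined_rr([(10, 100, 20, 100), (20, 100, 10, 100)], random_effects=True)` pools two made-up trials with opposite effects. The random-effects interval is wider than the fixed-effects one because the estimated between-trial variance is added to each trial's own variance.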
The 170 trials contributed a total of 175 independent comparisons of experimental with standard treatments. The combined RR for all 175 comparisons was 0.994 [95% confidence interval (CI) 0.978-1.010] using a random-effects model and 1.002 (95% CI 0.996-1.008) using a fixed-effects model. Of the 175 comparisons, experimental treatment was considered to be non-inferior in 130 (74%). The combined RR for these 130 comparisons was 0.995 (95% CI 0.983-1.006) and the point estimate favoured the experimental treatment in 58% (n = 76) and standard treatment in 42% (n = 54). The median non-inferiority margin (RR) pre-specified by trialists was 1.31 [inter-quartile range (IQR) 1.18-1.59].
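To make the role of the non-inferiority margin concrete, the sketch below applies the usual decision rule: an experimental treatment is declared non-inferior when the upper confidence limit of the RR lies below the pre-specified margin. It assumes that an RR above 1 means a worse result for the experimental treatment, which is consistent with the reported margins being greater than 1. Comparing the pooled interval with the median margin is only an illustration; in the trials themselves each comparison was judged against its own margin, and `is_non_inferior` is a hypothetical helper name.

```python
def is_non_inferior(ci_upper, margin):
    """True when the upper confidence limit of the RR lies below the margin.

    Assumes RR > 1 is unfavourable to the experimental treatment.
    """
    return ci_upper < margin


# Pooled random-effects upper limit (1.010) against the median margin (1.31),
# both taken from the results above.
pooled_vs_median = is_non_inferior(1.010, 1.31)

# A made-up comparison whose upper limit (1.40) exceeds the same margin.
hypothetical_failure = is_non_inferior(1.40, 1.31)

# Proportions reported above, recomputed from the counts.
share_non_inferior = 130 / 175      # about 0.74
share_favouring_experimental = 76 / 130  # about 0.58
```

Under these assumptions the pooled upper limit sits well inside the median margin, whereas the made-up comparison would not earn a verdict of non-inferiority.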
In this meta-analysis of non-inferiority trials the average RR comparing experimental with standard treatments was close to 1. The experimental treatments that gain a verdict of non-inferiority in published trials do not appear to be systematically less effective than the standard treatments. Importantly, publication bias and bias in the design and reporting of the studies cannot be ruled out and may have skewed the study results in favour of the experimental treatments. Further studies are required to examine the importance of such bias.