Division of Clinical Epidemiology, University Hospitals of Geneva and Faculty of Medicine, University of Geneva, Geneva, Switzerland.
BMC Med Res Methodol. 2010 Oct 15;10:93. doi: 10.1186/1471-2288-10-93.
The smallest difference to be detected in superiority trials, or the largest difference to be ruled out in noninferiority trials, is a key determinant of sample size, but little guidance exists to help researchers choose it. The objectives were to examine the distribution of differences that researchers aim to detect in clinical trials and to verify that those differences are smaller in noninferiority trials than in superiority trials.
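To illustrate why the target difference drives sample size, the sketch below uses the usual normal-approximation formula for comparing two means with equal allocation, n per arm = 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardized difference. This formula and the function name `n_per_arm` are assumptions made for illustration; the abstract does not state how the sampled trials computed their sample sizes.

```python
import math
from statistics import NormalDist


def n_per_arm(std_diff, alpha=0.05, power=0.80):
    """Approximate sample size per arm for a two-arm trial.

    Assumes a two-sided test at level alpha, equal allocation, and the
    normal approximation; std_diff is the standardized difference
    (difference divided by the standard deviation).
    """
    if std_diff <= 0:
        raise ValueError("std_diff must be positive")
    z = NormalDist().inv_cdf          # standard normal quantile function
    z_alpha = z(1 - alpha / 2)        # critical value for the test
    z_beta = z(power)                 # quantile matching the desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / std_diff ** 2)


if __name__ == "__main__":
    # Illustrative values only: halving the standardized difference
    # roughly quadruples the required sample size.
    for d in (0.5, 0.25):
        print(d, n_per_arm(d))
```

Because the standardized difference enters the formula squared, a modest reduction in the difference to be detected or ruled out translates into a much larger trial.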
Cross-sectional study based on a random sample of 200 two-arm, parallel-group superiority (100) and noninferiority (100) randomized clinical trials published between 2004 and 2009 in 27 leading medical journals. The main outcome measure was the difference used for sample size computation: the smallest difference in favor of the new treatment to be detected (superiority trials) or the largest unfavorable difference to be ruled out (noninferiority trials), expressed as a standardized difference in proportions or a standardized difference in means. Student's t test and analysis of variance were used.
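The two standardized measures can be sketched as follows. The abstract does not give the formulas, so the definitions here are assumptions: the difference in means divided by a common standard deviation, and the difference in proportions divided by the square root of p̄(1 − p̄), where p̄ is the average of the two proportions. The function names are hypothetical.

```python
import math


def standardized_diff_means(mean_new, mean_control, sd):
    """Difference in means divided by the (assumed common) standard deviation."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (mean_new - mean_control) / sd


def standardized_diff_proportions(p_new, p_control):
    """Difference in proportions divided by sqrt(p_bar * (1 - p_bar)).

    p_bar is the simple average of the two proportions (equal allocation
    assumed). This is one common definition, not necessarily the one used
    in the study.
    """
    p_bar = (p_new + p_control) / 2
    if not 0 < p_bar < 1:
        raise ValueError("average proportion must lie strictly between 0 and 1")
    return (p_new - p_control) / math.sqrt(p_bar * (1 - p_bar))


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    print(standardized_diff_means(10.0, 8.0, 4.0))
    print(standardized_diff_proportions(0.6, 0.4))
```

Expressing both kinds of outcome on a standardized scale is what allows target differences from trials with different endpoints to be compared.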
The differences to be detected or ruled out varied considerably from one study to the next; e.g., for superiority trials, the standardized difference in means ranged from 0.007 to 0.87, and the standardized difference in proportions from 0.04 to 1.56. On average, superiority trials were designed to detect larger differences than noninferiority trials (standardized difference in proportions: mean 0.37 versus 0.27, P = 0.001; standardized difference in means: 0.56 versus 0.40, P = 0.006). Standardized differences were lower for mortality than for other outcomes, and lower in cardiovascular trials than in other research areas.
Superiority trials are designed to detect larger differences than noninferiority trials are designed to rule out. The variability between studies is considerable and is partly explained by the type of outcome and the medical context. A more explicit and rational approach to choosing the difference to be detected or to be ruled out in clinical trials may be desirable.