School of Social and Community Medicine, University of Bristol, Bristol, UK.
Health Technol Assess. 2012 Sep;16(35):1-82. doi: 10.3310/hta16350.
The design of randomised controlled trials (RCTs) should incorporate characteristics (such as concealment of randomised allocation and blinding of participants and personnel) that avoid biases resulting from lack of comparability of the intervention and control groups. Empirical evidence suggests that the absence of such characteristics leads to biased intervention effect estimates, but the findings of different studies are not consistent.
The objective was to examine the influence of unclear or inadequate random sequence generation and allocation concealment, and unclear or absent double blinding, on intervention effect estimates and between-trial heterogeneity, and whether these influences vary with type of clinical area, intervention, comparison and outcome measure.
Data were combined from seven contributing meta-epidemiological studies (collections of meta-analyses in which trial characteristics are assessed and results recorded). The resulting database was used to identify and remove overlapping meta-analyses. Outcomes were coded such that odds ratios < 1 correspond to beneficial intervention effects. Outcome measures were classified as mortality, other objective, or subjective. We examined agreement between assessments of trial characteristics in trials assessed in more than one contributing study. We used hierarchical Bayesian bias models to estimate the effect of trial characteristics on average bias [quantified as ratios of odds ratios (RORs) with 95% credible intervals (CrIs) comparing trials with and without a characteristic] and on increases in between-trial heterogeneity.
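To make the ROR concrete, the following is a minimal sketch of a crude ratio of odds ratios within a single meta-analysis. It is not the hierarchical Bayesian bias model used in this study: it simply pools log odds ratios by fixed-effect inverse-variance weighting in each group of trials and takes the ratio. The function names and the trial counts are hypothetical and chosen only for illustration.

```python
import math

def log_odds_ratio(events_trt, n_trt, events_ctl, n_ctl):
    """Return (log OR, variance) for one trial.

    A 0.5 continuity correction is added to every cell for simplicity
    (an assumption of this sketch, not a detail taken from the study).
    """
    a = events_trt + 0.5
    b = n_trt - events_trt + 0.5
    c = events_ctl + 0.5
    d = n_ctl - events_ctl + 0.5
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pooled_log_or(trials):
    """Fixed-effect inverse-variance pooled log OR over a list of trials."""
    estimates = [log_odds_ratio(*trial) for trial in trials]
    weights = [1 / variance for _, variance in estimates]
    weighted_sum = sum(w * est for w, (est, _) in zip(weights, estimates))
    return weighted_sum / sum(weights)

def ratio_of_odds_ratios(trials_without, trials_with):
    """Crude ROR: pooled OR in trials lacking a characteristic divided by
    the pooled OR in trials that have it."""
    return math.exp(pooled_log_or(trials_without) - pooled_log_or(trials_with))

# Hypothetical trials: (events_treatment, n_treatment, events_control, n_control)
adequately_concealed = [(30, 100, 40, 100), (45, 150, 55, 150)]
inadequately_concealed = [(20, 100, 40, 100), (25, 120, 45, 120)]

ror = ratio_of_odds_ratios(inadequately_concealed, adequately_concealed)
print(f"Crude ROR: {ror:.2f}")
```

Because outcomes are coded so that odds ratios < 1 are beneficial, an ROR below 1 in this sketch means the trials lacking the characteristic show a more beneficial pooled effect than those with it.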
The analysis data set contained 1973 trials included in 234 meta-analyses. Median kappa statistics for agreement between assessments of trial characteristics were: sequence generation 0.60, allocation concealment 0.58 and blinding 0.87. Intervention effect estimates were exaggerated by an average of 11% in trials with inadequate or unclear (compared with adequate) sequence generation (ROR 0.89, 95% CrI 0.82 to 0.96); between-trial heterogeneity was higher among such trials. Bias associated with inadequate or unclear sequence generation was greatest for subjective outcomes (ROR 0.83, 95% CrI 0.74 to 0.94) and the increase in heterogeneity was greatest for such outcomes [standard deviation (SD) 0.20, 95% CrI 0.03 to 0.32]. The effect of inadequate or unclear (compared with adequate) allocation concealment was greatest among meta-analyses with a subjectively assessed outcome (ROR 0.85, 95% CrI 0.75 to 0.95), and the increase in between-trial heterogeneity was also greatest for such outcomes (SD 0.20, 95% CrI 0.02 to 0.33). Lack of, or unclear, double blinding (compared with double blinding) was associated with an average 13% exaggeration of intervention effects (ROR 0.87, 95% CrI 0.79 to 0.96), and between-trial heterogeneity was increased for such studies (SD 0.14, 95% CrI 0.02 to 0.30). For blinding, average bias (ROR 0.78, 95% CrI 0.65 to 0.92) and between-trial heterogeneity (SD 0.37, 95% CrI 0.19 to 0.53) were again greatest for meta-analyses assessing subjective outcomes. Among meta-analyses with subjectively assessed outcomes, the effect of lack of blinding appeared greater than the effect of inadequate or unclear sequence generation or allocation concealment.
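The percentage exaggerations quoted above follow directly from the RORs: with odds ratios < 1 coded as beneficial, an ROR of 0.89 corresponds to odds ratios that are on average 11% smaller in trials lacking the characteristic. A short sketch of that arithmetic, using the overall RORs and CrIs reported here (the function name is an assumption made for illustration):

```python
def exaggeration_percent(ror, cri_lower, cri_upper):
    """Convert an ROR and its 95% CrI into a percentage exaggeration.

    The interval bounds swap places: the upper ROR bound (closest to 1)
    gives the smallest exaggeration, the lower bound gives the largest.
    """
    point = (1 - ror) * 100
    smallest = (1 - cri_upper) * 100
    largest = (1 - cri_lower) * 100
    return point, smallest, largest

# Overall RORs with 95% CrIs as reported in the results
reported = {
    "sequence generation": (0.89, 0.82, 0.96),
    "double blinding": (0.87, 0.79, 0.96),
}

for characteristic, (ror, lower, upper) in reported.items():
    point, smallest, largest = exaggeration_percent(ror, lower, upper)
    print(f"{characteristic}: {point:.0f}% ({smallest:.0f}% to {largest:.0f}%)")
```

Rounded to whole percentages, this reproduces the 11% and 13% figures given for sequence generation and double blinding.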
Bias associated with specific reported study design characteristics leads to exaggeration of beneficial intervention effect estimates and increases in between-trial heterogeneity. For each of the three characteristics assessed, these effects were greatest for subjectively assessed outcomes. Assessments of the risk of bias in RCTs should account for these findings. Further research is needed to understand the effects of attrition bias, as well as the relative importance of blinding of patients, care-givers and outcome assessors, and thus separate the effects of performance and detection bias.
This work was funded by the National Institute for Health Research Health Technology Assessment programme.