Thall P, Fox P, Wathen J
Department of Biostatistics, U.T. M.D. Anderson Cancer Center, Houston
Ann Oncol. 2015 Aug;26(8):1621-8. doi: 10.1093/annonc/mdv238. Epub 2015 May 15.
In recent years, various outcome adaptive randomization (AR) methods have been used to conduct comparative clinical trials. Rather than randomizing patients equally between treatments, outcome AR uses the accumulating data to unbalance the randomization probabilities in favor of the treatment arm that currently is superior empirically. This is motivated by the idea that, on average, more patients in the trial will be given the treatment that is truly superior, so AR is ethically more desirable than equal randomization. AR remains controversial, however, and some of its properties are not well understood by the clinical trials community.
Computer simulation was used to evaluate properties of a 200-patient clinical trial conducted using one of four Bayesian AR methods and compare them to an equally randomized group sequential design.
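As an illustration of how a Bayesian AR rule can turn accumulating data into a randomization probability, here is a minimal sketch. It is not taken from the abstract, which does not specify the four AR methods. It assumes binary outcomes, independent Beta(1, 1) priors on each arm's response probability, and a power-transformed allocation rule with a tuning exponent `c`. All function and parameter names are hypothetical.

```python
import random


def prob_b_better(succ_a, fail_a, succ_b, fail_b, draws=4000, rng=None):
    """Monte Carlo estimate of P(theta_B > theta_A | data).

    Assumes independent Beta(1, 1) priors (an assumption of this sketch),
    so each posterior is Beta(1 + successes, 1 + failures).
    """
    rng = rng or random.Random(0)
    wins = 0
    for _ in range(draws):
        theta_a = rng.betavariate(1 + succ_a, 1 + fail_a)
        theta_b = rng.betavariate(1 + succ_b, 1 + fail_b)
        wins += theta_b > theta_a
    return wins / draws


def ar_prob_b(p_b_better, c=0.5):
    """Probability of randomizing the next patient to arm B.

    c = 0 gives equal randomization; c = 1 uses the posterior
    probability directly; intermediate values damp the imbalance.
    """
    num = p_b_better ** c
    return num / (num + (1.0 - p_b_better) ** c)
```

With no data, or with data that favor neither arm, `ar_prob_b` returns 0.5; as the posterior probability moves toward 0 or 1, the allocation probability follows it, more slowly for smaller `c`.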
Outcome AR has several undesirable properties. One, which might surprise nonstatisticians, is a high probability of a sample size imbalance in the wrong direction, in which many more patients are assigned to the inferior treatment arm, the opposite of the intended effect. Compared with an equally randomized design, outcome AR produces less reliable final inferences, including a large overestimate of the actual treatment effect difference and lower power to detect a treatment difference. This estimation bias becomes much larger if the prognosis of the accrued patients either improves or worsens systematically during the trial.
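The wrong-direction imbalance can be explored with a small simulation. The following is a minimal sketch, not the authors' simulation code. It assumes binary outcomes observed immediately, Beta(1, 1) priors, and a power-transformed Bayesian allocation rule with exponent `c`. The function names, the 20-patient imbalance margin, and the response rates 0.25 and 0.35 are hypothetical choices made for illustration.

```python
import random


def simulate_trial(true_a, true_b, n_patients=200, c=0.5, draws=100, rng=None):
    """Simulate one two-arm trial under a Bayesian AR rule.

    Returns (n_a, n_b), the number of patients assigned to each arm.
    """
    rng = rng or random.Random()
    succ = {"A": 0, "B": 0}
    fail = {"A": 0, "B": 0}
    for _ in range(n_patients):
        # Monte Carlo estimate of P(theta_B > theta_A | data so far).
        wins = sum(
            rng.betavariate(1 + succ["B"], 1 + fail["B"])
            > rng.betavariate(1 + succ["A"], 1 + fail["A"])
            for _ in range(draws)
        )
        p = wins / draws
        num = p ** c
        prob_b = num / (num + (1.0 - p) ** c)
        # Randomize the patient, then observe a binary outcome.
        arm = "B" if rng.random() < prob_b else "A"
        rate = true_b if arm == "B" else true_a
        if rng.random() < rate:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return succ["A"] + fail["A"], succ["B"] + fail["B"]


def wrong_direction_rate(true_a=0.25, true_b=0.35, n_trials=100,
                         margin=20, seed=1, **trial_kwargs):
    """Fraction of simulated trials in which the inferior arm A
    receives at least `margin` more patients than the superior arm B."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_trials):
        n_a, n_b = simulate_trial(true_a, true_b, rng=rng, **trial_kwargs)
        wrong += (n_a - n_b) >= margin
    return wrong / n_trials
```

Calling `wrong_direction_rate()` estimates how often the imbalance goes the wrong way under these assumed settings. The estimate depends on the seed, the number of simulated trials, and the chosen rule, so it should not be read as reproducing the figures reported in the paper.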
AR produces inferential problems that decrease potential benefit to future patients, and may decrease benefit to patients enrolled in the trial. These problems should be weighed against its putative ethical benefit. For randomized comparative trials intended to obtain confirmatory comparisons, designs with fixed randomization probabilities and group sequential decision rules appear to be preferable to AR, both scientifically and ethically.