Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK.
Statistical Innovation, Oncology R&D, AstraZeneca, Cambridge, UK.
Stat Methods Med Res. 2021 Jul;30(7):1725-1743. doi: 10.1177/09622802211017574. Epub 2021 Jun 2.
The number of Phase III trials that include a biomarker in design and analysis has increased due to interest in personalised medicine. For genetic mutations and other predictive biomarkers, the trial sample comprises two subgroups, one of which is known or suspected to achieve a larger treatment effect than the other. Despite treatment effect heterogeneity, trials often draw patients from both subgroups, since the lower-responding subgroup may also benefit from the intervention. In this case, regulators/commissioners must decide what constitutes sufficient evidence to approve the drug in the population.
Assuming trial analysis can be completed using generalised linear models, we define and evaluate three frequentist decision rules for approval. For rule one, the significance of the average treatment effect in the lower-responding subgroup should exceed a pre-defined minimum value. For rule two, the data from the lower-responding subgroup should increase statistical significance. For rule three, the subgroup-treatment interaction should be non-significant, using a type I error rate chosen to ensure that the estimated difference between the two subgroup effects is acceptable. Rules are evaluated based on conditional power, given that there is an overall significant treatment effect. We show how the rules perform according to the distribution of patients across the two subgroups and when analyses include additional (stratification) covariates, which induce correlation between subgroup effects.
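The three rules described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SubgroupEstimate`, `pooled_estimate` and `approval_rules`, the default thresholds (`z_min`, `alpha_overall`, `alpha_interaction`), the inverse-variance pooling of the two subgroup estimates, and the reading of rule two as "the pooled z-statistic exceeds that of the higher-responding subgroup alone" are all assumptions made for the example. Positive effects are taken to indicate benefit, and conditional power is not computed.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass
class SubgroupEstimate:
    """Estimated treatment effect (e.g. a GLM coefficient) and its standard error."""
    effect: float
    se: float

    @property
    def z(self) -> float:
        return self.effect / self.se


def pooled_estimate(high, low, rho=0.0):
    """Inverse-variance weighted average of the two subgroup effects.

    ASSUMPTION: this pooling scheme is chosen for illustration only.
    rho is the correlation between the two subgroup estimates.
    """
    w_high, w_low = 1.0 / high.se**2, 1.0 / low.se**2
    a, b = w_high / (w_high + w_low), w_low / (w_high + w_low)
    effect = a * high.effect + b * low.effect
    var = (a * high.se) ** 2 + (b * low.se) ** 2 + 2 * a * b * rho * high.se * low.se
    return SubgroupEstimate(effect, sqrt(var))


def approval_rules(high, low, z_min=1.0, alpha_overall=0.05,
                   alpha_interaction=0.2, rho=0.0):
    """Evaluate the three rules; all default thresholds are hypothetical."""
    norm = NormalDist()
    overall = pooled_estimate(high, low, rho)

    # Precondition in the text: an overall significant treatment effect.
    overall_significant = overall.z > norm.inv_cdf(1 - alpha_overall / 2)

    # Rule one: significance in the lower-responding subgroup exceeds a minimum.
    rule_one = low.z >= z_min

    # Rule two (assumed reading): adding the lower-responding subgroup's data
    # increases significance relative to the higher-responding subgroup alone.
    rule_two = overall.z > high.z

    # Rule three: interaction test non-significant at a relaxed type I error rate.
    se_diff = sqrt(high.se**2 + low.se**2 - 2 * rho * high.se * low.se)
    z_interaction = (high.effect - low.effect) / se_diff
    rule_three = abs(z_interaction) < norm.inv_cdf(1 - alpha_interaction / 2)

    return {
        "overall_significant": overall_significant,
        "rule_one": rule_one,
        "rule_two": rule_two,
        "rule_three": rule_three,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    high = SubgroupEstimate(effect=0.5, se=0.15)
    low = SubgroupEstimate(effect=0.3, se=0.20)
    print(approval_rules(high, low))
```

Passing a non-zero `rho` mimics the correlation between subgroup effects that arises when additional stratification covariates are included in the analysis.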
When additional conditions are required for approval of a new treatment in a lower response subgroup, easily applied rules based on minimum effect sizes and relaxed interaction tests are available. Choice of rule is influenced by the proportion of patients sampled from the two subgroups but less so by the correlation between subgroup effects.