Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Leiden, The Netherlands.
Institute for Clinical Trials and Methodology, University College London, London, UK.
Clin Trials. 2022 Feb;19(1):14-21. doi: 10.1177/17407745211053790. Epub 2021 Oct 24.
The size of the margin strongly influences the required sample size in non-inferiority and equivalence trials. What is sometimes ignored, however, is that for trials with binary outcomes, the scale of the margin (risk difference, risk ratio or odds ratio) also has a large impact on power and thus on the required sample size. When several scales are considered at the design stage of a trial, these sample size consequences should be taken into account. Sometimes the scale needs to be changed at a later stage of a trial, for example, when the event proportion in the control arm turns out to differ from what was expected. A switch to another scale is also sometimes made after completion of a trial, for example, when using a regression model in a secondary analysis or when combining study results in a meta-analysis that requires a common scale. The exact consequences of such switches are currently unknown.
This article first outlines the sample size consequences of different choices of analysis scale at the design stage of a trial. We add a new result on the sample size requirement that compares the risk difference scale with the risk ratio scale. Then, we study two different approaches to changing the analysis scale after the trial has commenced: (1) mapping the original non-inferiority margin using the event proportion in the control arm that was anticipated at the design stage or (2) mapping the original non-inferiority margin using the observed event proportion in the control arm. We use simulations to illustrate the consequences for type I and type II error rates. Methods are illustrated on the INES trial, a non-inferiority trial that compared single birth rates in subfertile couples after different fertility treatments.
Our results demonstrate large differences in required sample size when choosing between risk difference, risk ratio and odds ratio scales at the design stage of non-inferiority trials. In some cases, the sample size requirement is twice as large on one scale compared with another. Changing the scale after commencing the trial using anticipated proportions mainly impacts the type II error rate, whereas switching using observed proportions is not advised because it does not maintain the type I error rate. Differences were more pronounced with larger margins.
Trialists should be aware that the analysis scale can have a large impact on type I and type II error rates in non-inferiority trials.
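To make the margin mapping and its sample size consequences concrete, the following is a minimal sketch and not the authors' code. It assumes textbook Wald-type normal approximations on the risk difference, log risk ratio and log odds ratio scales, a one-sided alpha of 0.025, 90% power, and equal true event proportions in both arms. The function names `map_margin` and `n_per_arm` and all numeric inputs are hypothetical, chosen for illustration only.

```python
from math import ceil, log
from statistics import NormalDist


def map_margin(p0, rd_margin):
    """Map a risk-difference margin to risk-ratio and odds-ratio margins.

    p0 is the control-arm event proportion used for the mapping (anticipated
    or observed); rd_margin is the margin on the scale p1 - p0.
    """
    p1 = p0 + rd_margin  # experimental-arm proportion exactly at the margin
    if not (0 < p0 < 1 and 0 < p1 < 1):
        raise ValueError("p0 and p0 + rd_margin must lie strictly in (0, 1)")
    rr = p1 / p0
    odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
    return rr, odds_ratio


def n_per_arm(p0, rd_margin, scale, alpha=0.025, power=0.90):
    """Approximate per-arm sample size, assuming both arms truly share p0.

    Illustrative Wald-type formulas only; not necessarily those of the paper.
    """
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    rr, odds_ratio = map_margin(p0, rd_margin)
    if scale == "RD":
        var, effect = 2 * p0 * (1 - p0), rd_margin
    elif scale == "RR":
        var, effect = 2 * (1 - p0) / p0, log(rr)
    elif scale == "OR":
        var, effect = 2 / (p0 * (1 - p0)), log(odds_ratio)
    else:
        raise ValueError("scale must be 'RD', 'RR' or 'OR'")
    return ceil(z ** 2 * var / effect ** 2)


if __name__ == "__main__":
    # Hypothetical design values: control proportion 0.30, margin +0.10.
    for scale in ("RD", "RR", "OR"):
        print(scale, n_per_arm(0.30, 0.10, scale))
    # The same risk-difference margin maps to a different risk ratio
    # when a different control proportion is plugged in.
    print(map_margin(0.30, 0.10)[0], map_margin(0.20, 0.10)[0])
```

Under these assumptions, margins that coincide at the assumed control proportion still lead to different sample sizes on the three scales, and the mapped risk ratio margin depends on whether the anticipated or a different (for example, observed) control proportion is used.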