Statistical Methodology and Consulting Center, Novartis Pharma AG, Basel, Switzerland.
Shanghai University of Finance and Economics, Shanghai, China.
Stat Med. 2018 May 10;37(10):1587-1607. doi: 10.1002/sim.7614. Epub 2018 Feb 20.
In a 2×2 crossover trial for establishing average bioequivalence (ABE) of a generic agent and a currently marketed drug, the recommended approach to hypothesis testing is the two one-sided test (TOST) procedure, which depends, among other things, on the estimated within-subject variability. The power of this procedure, and therefore the sample size required to achieve a minimum power, depends on having a good estimate of this variability. When there is uncertainty, it is advisable to plan the design in two stages, with an interim sample size reestimation after the first stage, using an interim estimate of the within-subject variability. One method and 3 variations of doing this were proposed by Potvin et al. Using simulation, the operating characteristics, including the empirical type I error rate, of the 4 variations (called Methods A, B, C, and D) were assessed by Potvin et al and Methods B and C were recommended. However, none of these 4 variations formally controls the type I error rate of falsely claiming ABE, even though the amount of inflation produced by Method C was considered acceptable. A major disadvantage of assessing type I error rate inflation using simulation is that unless all possible scenarios for the intended design and analysis are investigated, it is impossible to be sure that the type I error rate is controlled. Here, we propose an alternative, principled method of sample size reestimation that is guaranteed to control the type I error rate at any given significance level. This method uses a new version of the inverse-normal combination of p-values test, in conjunction with standard group sequential techniques, that is more robust to large deviations in initial assumptions regarding the variability of the pharmacokinetic endpoints. The sample size reestimation step is based on significance levels and power requirements that are conditional on the first-stage results. 
This necessitates a discussion and exploitation of the peculiar properties of the power curve of the TOST testing procedure. We illustrate our approach with an example based on a real ABE study and compare the operating characteristics of our proposed method with those of Method B of Potvin et al.
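As a rough illustration of the inverse-normal combination idea mentioned above, the following Python sketch combines stage-wise one-sided p-values with prespecified weights and applies the combination separately to the two one-sided TOST hypotheses. It shows the standard inverse-normal combination test, not the new version the abstract proposes. The function names, the default weight `w1=0.5`, and the `alpha_final` parameter are assumptions made for this example; in a real group sequential design the final-stage critical level would come from the chosen stopping boundaries.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def inverse_normal_combination(p1, p2, w1=0.5):
    """Combine two independent one-sided stage-wise p-values.

    w1 is the weight for stage 1 (hypothetical default 0.5); stage 2 gets
    1 - w1. The weights must be fixed before the interim analysis.
    p1 and p2 must lie strictly between 0 and 1.
    Returns the combined z statistic and its one-sided p-value.
    """
    if not 0.0 < w1 < 1.0:
        raise ValueError("w1 must lie strictly between 0 and 1")
    z1 = _STD_NORMAL.inv_cdf(1.0 - p1)
    z2 = _STD_NORMAL.inv_cdf(1.0 - p2)
    z = sqrt(w1) * z1 + sqrt(1.0 - w1) * z2
    return z, 1.0 - _STD_NORMAL.cdf(z)


def tost_combined(p_lower, p_upper, alpha_final, w1=0.5):
    """Declare ABE only if both one-sided hypotheses are rejected.

    p_lower and p_upper are (stage 1, stage 2) pairs of one-sided p-values
    for the lower and upper equivalence margins. alpha_final is a
    user-supplied final-stage level (an assumption of this sketch).
    """
    _, p_comb_lower = inverse_normal_combination(*p_lower, w1=w1)
    _, p_comb_upper = inverse_normal_combination(*p_upper, w1=w1)
    return p_comb_lower <= alpha_final and p_comb_upper <= alpha_final
```

Because the weights are fixed in advance, the combined statistic stays standard normal under the null hypothesis even when the second-stage sample size is chosen from first-stage data, which is the property that lets such a test control the type I error rate by construction rather than by simulation.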