Tsong Yi, Chen Wen-Jen
Office of Biostatistics/Office of Translational Sciences, CDER, US FDA, Silver Spring, MD 20993-0002, USA.
J Biopharm Stat. 2007;17(2):289-308. doi: 10.1080/10543400601177368.
To fulfill the requirements of a new drug application, a sponsor often needs to conduct multiple clinical trials. These trials often have designs more complicated than a randomized two-sample, single-factor study. For example, they may involve multiple centers, multiple factors, covariates, or group sequential and/or adaptive schemes. When an active standard treatment is used as the control in a two-arm clinical trial, the efficacy of the test treatment is often established by a noninferiority test comparing the test treatment with the active standard treatment. Typically, noninferiority trials are designed with either a generalized historical control approach (i.e., the noninferiority margin approach or delta-margin approach) or a cross-trial comparison approach (i.e., the synthesis approach or lambda-margin approach). Most of the statistical properties of these approaches discussed in the literature concern testing in a simple two-sample comparison. We studied the limitations of the two approaches with respect to switching between superiority and noninferiority testing, feasibility of application with a group sequential design, constancy assumption requirements, test dependency across multiple trials, analysis of homogeneity of efficacy among centers in a multicenter trial, data transformation, and changing the analysis method from that used in the historical studies. Our evaluation shows that the cross-trial comparison approach is largely restricted to the simple two-sample comparison with a normal approximation test because of its poor properties under more complicated designs and analyses. On the other hand, the generalized historical control approach may be more flexible when the variability of the margin delta is indeed negligibly small.
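As a rough illustration of how the two approaches differ in the simple two-sample, normal-approximation setting, the sketch below computes both test statistics. It is not taken from the paper. The function names, the `retain` parameter (the fraction of the historical control effect to be preserved), the convention that larger values are better, and all numbers are assumptions chosen for illustration. The paper's own delta and lambda notation may be defined differently.

```python
import math
from statistics import NormalDist


def fixed_margin_z(diff_tc, se_tc, delta):
    """Z statistic for H0: (T - C) <= -delta, with delta treated as a known constant."""
    return (diff_tc + delta) / se_tc


def synthesis_z(diff_tc, se_tc, diff_cp, se_cp, retain):
    """Z statistic for H0: (T - C) <= -(1 - retain) * (C - P).

    Combines the current-trial estimate of T - C with the historical
    estimate of C - P and carries the variance of both estimates.
    """
    w = 1.0 - retain
    return (diff_tc + w * diff_cp) / math.sqrt(se_tc**2 + (w * se_cp) ** 2)


def one_sided_p(z):
    """One-sided p-value under the standard normal approximation."""
    return 1.0 - NormalDist().cdf(z)


# Hypothetical inputs (illustration only, not data from any study).
diff_tc, se_tc = -0.5, 1.0   # test minus active control, current trial
diff_cp, se_cp = 6.0, 1.5    # active control minus placebo, historical trials
retain = 0.5                 # assumed fraction of the control effect to preserve

# Fixed margin: a fraction of the lower 95% confidence bound of the historical effect.
lower_cp = diff_cp - NormalDist().inv_cdf(0.975) * se_cp
delta = (1.0 - retain) * lower_cp

z_fixed = fixed_margin_z(diff_tc, se_tc, delta)
z_synth = synthesis_z(diff_tc, se_tc, diff_cp, se_cp, retain)

print(f"delta = {delta:.3f}")
print(f"fixed margin: z = {z_fixed:.3f}, p = {one_sided_p(z_fixed):.4f}")
print(f"synthesis:    z = {z_synth:.3f}, p = {one_sided_p(z_synth):.4f}")
```

In this sketch, setting `se_cp` to zero makes the synthesis statistic coincide with the fixed-margin statistic for `delta = (1 - retain) * diff_cp`. That is one way to read the abstract's remark that the historical control approach behaves well when the variability of the margin is negligibly small.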