Warwick Medical School, University of Warwick, Coventry, UK.
Biostatistics and Data Sciences, Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach/Riss, Germany.
Pharm Stat. 2023 Jan;22(1):96-111. doi: 10.1002/pst.2262. Epub 2022 Aug 26.
Regulatory agencies usually require two significant pivotal trials to approve a new drug, a standard known as the two-trial paradigm. However, several authors have questioned why exactly two pivotal trials are needed, which statistical error regulators are trying to protect against, and whether alternative approaches exist. Investigating these questions is therefore important for understanding regulatory decision-making in the assessment of drug effectiveness. Two identically designed trials are commonly run solely to comply with the two-trial rule. Previous work showed that combining the data from the two trials into a single trial (the one-trial paradigm) would increase power while ensuring the same level of type I error protection as the two-trial paradigm. However, this holds only under a specific scenario, and type I error protection over the whole null region has received little investigation. In this article, we compare the two paradigms in scenarios where the two trials are conducted in identical or different populations and have equal or unequal sizes. With identical populations, the results show that a single trial provides better type I error protection and higher power. Conversely, with different populations, although the one-trial rule is more powerful in some cases, it does not always protect against type I error. Hence, appropriate flexibility is needed around the two-trial paradigm, and the approach should be chosen based on the questions of interest.
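The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes normally distributed test statistics with known variance, one-sided testing at level alpha = 0.025 in each of the two trials, and a pooled single trial tested at level alpha squared so that both rules have the same rejection probability when neither population has an effect. The function names, the standardized effects `theta1` and `theta2`, and the sample sizes `n1` and `n2` are hypothetical and chosen for illustration only.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def two_trial_reject_prob(theta1, theta2, n1, n2, alpha=0.025):
    """Probability that BOTH independent trials are significant.

    Assumption: trial i yields Z_i ~ N(theta_i * sqrt(n_i), 1) and is
    significant when Z_i exceeds the one-sided level-alpha critical value.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    p1 = 1 - STD_NORMAL.cdf(z_crit - theta1 * sqrt(n1))
    p2 = 1 - STD_NORMAL.cdf(z_crit - theta2 * sqrt(n2))
    return p1 * p2


def one_trial_reject_prob(theta1, theta2, n1, n2, alpha=0.025):
    """Probability that a single pooled trial is significant.

    Assumption: the pooled statistic weights each population by its sample
    size and is tested at one-sided level alpha**2.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha**2)
    drift = (n1 * theta1 + n2 * theta2) / sqrt(n1 + n2)
    return 1 - STD_NORMAL.cdf(z_crit - drift)


if __name__ == "__main__":
    scenarios = {
        "no effect in either population": (0.0, 0.0),
        "same positive effect in both": (0.3, 0.3),
        "effect in one population only": (0.0, 1.0),
    }
    for label, (t1, t2) in scenarios.items():
        two = two_trial_reject_prob(t1, t2, 100, 100)
        one = one_trial_reject_prob(t1, t2, 100, 100)
        print(f"{label}: two-trial={two:.6f}, one-trial={one:.6f}")
```

Under these assumptions, the two rules reject with the same probability when neither population has an effect, and the pooled trial rejects more often when both populations share the same positive effect. When only one population has an effect, the two-trial rule cannot reject with probability above alpha, whereas the pooled trial can reject almost surely if that effect is large. Whether this counts as a type I error depends on which null hypothesis is of interest.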