Luschei Remi, Brannath Werner
Institute for Statistics and Competence Center for Clinical Trials, University of Bremen, Bremen, Germany.
Stat Methods Med Res. 2025 Feb;34(2):390-404. doi: 10.1177/09622802241307237. Epub 2025 Jan 19.
The population-wise error rate is a type I error rate for clinical trials with multiple target populations. In such trials, a treatment is tested for its efficacy in each population. The population-wise error rate is defined as the probability that a randomly selected future patient will, based on the study results, be exposed to an inefficient treatment. It can be understood and computed as a weighted average of strata-specific family-wise error rates, with the prevalences of the strata as weights. A major issue with this concept is that the prevalences are usually unknown in practice, so the population-wise error rate cannot be controlled directly. Instead, one could plug in estimates of the prevalences based on the given sample, such as their maximum-likelihood estimates under a multinomial distribution. In this article, we demonstrate through simulations that using estimated prevalences does not substantially inflate the true population-wise error rate. We differentiate between the expected population-wise error rate, which is almost perfectly controlled, and study-specific values of the population-wise error rate, which are conditioned on all subgroup sample sizes and vary within a narrow range. We consider up to eight different overlapping populations and moderate to large sample sizes. In these settings, we also consider the maximum strata-wise family-wise error rate, which is found to be bounded, on average, by twice the significance level used for population-wise error rate control.
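The plug-in idea described in the abstract can be illustrated with a minimal Python sketch. It assumes that the population-wise error rate is the prevalence-weighted sum of the strata-specific family-wise error rates, and that the maximum-likelihood estimate of each prevalence under a multinomial model is the observed stratum proportion. The function names `mle_prevalences` and `pwer`, the stratum counts, and the error rates are hypothetical and chosen only for illustration; they are not taken from the article.

```python
from typing import List, Sequence


def mle_prevalences(counts: Sequence[int]) -> List[float]:
    """Multinomial maximum-likelihood estimates: observed stratum proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("total sample size must be positive")
    return [c / total for c in counts]


def pwer(prevalences: Sequence[float], strata_fwer: Sequence[float]) -> float:
    """Prevalence-weighted average of strata-specific family-wise error rates."""
    if len(prevalences) != len(strata_fwer):
        raise ValueError("need one family-wise error rate per stratum")
    return sum(p * f for p, f in zip(prevalences, strata_fwer))


# Hypothetical example: three disjoint strata with assumed sample sizes
# and assumed strata-specific family-wise error rates.
counts = [120, 60, 20]
strata_fwer = [0.02, 0.03, 0.04]

pi_hat = mle_prevalences(counts)        # estimated prevalences
pwer_hat = pwer(pi_hat, strata_fwer)    # plug-in population-wise error rate
max_fwer = max(strata_fwer)             # maximum strata-wise family-wise error rate

print(pi_hat, pwer_hat, max_fwer)
```

In this toy setting the plug-in value is a weighted average, so it lies between the smallest and largest strata-specific error rates; small strata with a high family-wise error rate contribute little to it, which is why the maximum strata-wise rate is examined separately.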