Fuyama Kanako, Sakamaki Kentaro, Uemura Kohei, Yokota Isao
Department of Biostatistics, Graduate School of Medicine, Hokkaido University, Sapporo, Japan.
Faculty of Health Data Science, Juntendo University, Tokyo, Japan.
Clin Trials. 2025 Jun;22(3):301-311. doi: 10.1177/17407745241304706. Epub 2025 Jan 3.
Background
In randomized clinical trials, multiple-testing procedures, composite endpoints, and prioritized outcome approaches are increasingly used to analyze multiple binary outcomes. Previous studies have shown that correlations between outcomes influence their sample size requirements. Although sample size is an important factor affecting the choice of statistical methods, the power and required sample sizes of methods for analyzing multiple binary outcomes have yet to be compared under the influence of outcome correlations.
Methods
We conducted simulations to evaluate the power of co-primary and multiple primary endpoints, composite endpoints, and prioritized outcome approaches based on generalized pairwise comparisons with varying correlations, marginal proportions, treatment effects, and number of outcomes. We then conducted a case study on sample size using a clinical trial of a migraine treatment as an example.
Results
The correlations significantly affected the statistical power and sample size of composite endpoints. The power and sample size of co-primary endpoints remained relatively stable across different correlations, though their power declined substantially when treatment effects were opposite on some components or more than two components were present. While the correlations influenced the power and sample size of all methods assessed, their direction and degree of influence varied between methods. Notably, the method with the greatest power and smallest sample size also differed depending on the correlations. When the correlations were the same between arms, prioritized outcome approaches usually had higher power and smaller sample sizes than other methods.
Conclusions
Anticipated correlations and their uncertainty should be considered when selecting statistical methods. Overall, co-primary endpoints remain a reliable option for evaluating the superiority of all components, although they are unsuitable for assessing the balance between treatment effects pointing in different directions. Generalized pairwise comparisons offer a useful alternative to deal with multiple prioritized outcomes, often providing the smallest sample sizes when the correlation structures are shared between the arms.
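The kind of simulation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not state the data-generating mechanism or the test statistics, so everything here is an assumption made for illustration. The sketch assumes two binary "response" outcomes with a common Pearson correlation in both arms, a one-sided pooled two-proportion z-test, a Bonferroni split for multiple primary endpoints, and a composite defined as response on at least one outcome. The function names (`joint_cells`, `simulate_arm`, `z_test`, `estimate_power`) and all numeric inputs are hypothetical, and generalized pairwise comparisons are omitted to keep the example short.

```python
import math
import random
from statistics import NormalDist


def joint_cells(p1, p2, rho):
    """Cell probabilities (both, first only, second only, neither) for two
    binary outcomes with marginals p1, p2 and Pearson correlation rho."""
    p11 = p1 * p2 + rho * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    cells = (p11, p1 - p11, p2 - p11, 1 - p1 - p2 + p11)
    if min(cells) < 0:
        raise ValueError("rho is not attainable for these marginals")
    return cells


def simulate_arm(n, p1, p2, rho, rng):
    """Draw n patients; return event counts for (outcome 1, outcome 2, composite)."""
    draws = rng.choices(range(4), weights=joint_cells(p1, p2, rho), k=n)
    both, only1, only2 = draws.count(0), draws.count(1), draws.count(2)
    return both + only1, both + only2, both + only1 + only2


def z_test(x_trt, x_ctl, n):
    """One-sided p-value (treatment > control), pooled two-proportion z-test
    with n patients per arm."""
    pooled = (x_trt + x_ctl) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 1.0
    return 1 - NormalDist().cdf((x_trt - x_ctl) / (n * se))


def estimate_power(n, trt, ctl, rho, n_sims=2000, alpha=0.025, seed=1):
    """Monte Carlo power for three analysis strategies (assumed definitions)."""
    rng = random.Random(seed)
    hits = {"co_primary": 0, "multiple_primary": 0, "composite": 0}
    for _ in range(n_sims):
        t1, t2, tc = simulate_arm(n, *trt, rho, rng)
        c1, c2, cc = simulate_arm(n, *ctl, rho, rng)
        p1, p2 = z_test(t1, c1, n), z_test(t2, c2, n)
        hits["co_primary"] += max(p1, p2) < alpha            # both outcomes must win
        hits["multiple_primary"] += min(p1, p2) < alpha / 2  # at least one, Bonferroni
        hits["composite"] += z_test(tc, cc, n) < alpha       # response on either outcome
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    # Hypothetical marginal response rates: (outcome 1, outcome 2) per arm.
    for rho in (0.0, 0.3, 0.6):
        print(rho, estimate_power(100, trt=(0.5, 0.4), ctl=(0.3, 0.2), rho=rho))
```

Under these assumptions the composite event probability is one minus the "neither" cell, so it changes with the correlation even when the marginal proportions are fixed. That is one simple way to see why composite endpoints would be sensitive to the correlation structure.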