Leon Andrew C
Department of Psychiatry, Weill Medical College of Cornell University, New York, NY 10021, USA.
J Clin Psychiatry. 2004 Nov;65(11):1511-4.
A researcher must carefully balance the risk of 2 undesirable outcomes when designing a clinical trial: false-positive results (type I error) and false-negative results (type II error). In planning the study, careful attention is routinely paid to statistical power (i.e., the complement of type II error) and corresponding sample size requirements. However, Bonferroni-type alpha adjustments to protect against type I error for multiple tests are often resisted. Here, a simple strategy is described that adjusts alpha for multiple primary efficacy measures, yet maintains statistical power for each test.
To illustrate the approach, multiplicity-adjusted sample size requirements were estimated for effects of various magnitudes with statistical power analyses for 2-tailed comparisons of 2 groups using χ² tests and t tests. These analyses estimated the required sample size for hypothetical clinical trial protocols in which the prespecified number of primary efficacy measures ranged from 1 to 5. Corresponding Bonferroni-adjusted alpha levels were used for these calculations.
Relative to that required for 1 test, the sample size increased by about 20% for 2 dependent variables and 30% for 3 dependent variables.
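The kind of calculation behind these increases can be sketched as follows. This is an illustrative approximation, not the procedure used in the article: it assumes the standard normal-approximation formula for a 2-tailed, 2-group comparison of means, a nominal alpha of .05, power of 0.80, and a standardized effect size d of 0.5, none of which are stated in the abstract. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, k=1, alpha=0.05, power=0.80):
    """Approximate per-group n for a 2-tailed, 2-group comparison of means.

    Uses the normal approximation n = 2 * (z_{1-a/2} + z_{power})^2 / d^2,
    where a = alpha / k is the Bonferroni-adjusted alpha for k primary
    efficacy measures. alpha, power, and d are assumed values chosen for
    illustration; an exact t-test calculation gives slightly larger n.
    """
    alpha_adj = alpha / k                      # Bonferroni adjustment
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha_adj / 2)
    z_power = std_normal.inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2 / d ** 2


if __name__ == "__main__":
    base = n_per_group(0.5, k=1)
    print("k  adj. alpha  n/group  ratio to k=1")
    for k in range(1, 6):
        n = n_per_group(0.5, k=k)
        print(f"{k}  {0.05 / k:.4f}      {ceil(n):>4}     {n / base:.2f}")
```

Under this approximation the ratio to the single-test sample size does not depend on d, because d cancels; only the adjusted alpha and the chosen power drive the relative increase.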
The strategy described adjusts alpha for multiple primary efficacy measures and, in turn, modifies the sample size to maintain statistical power. Although the strategy is not novel, it is typically overlooked in psychopharmacology trials. The number of primary efficacy measures must be prespecified and carefully limited when a clinical trial protocol is prepared. If multiple tests are designated in the protocol, the alpha-level adjustment should be anticipated and incorporated in sample size calculations.