Power and sample size when multiple endpoints are considered.

Author information

Senn Stephen, Bretz Frank

Affiliation

Department of Statistics, University of Glasgow, Glasgow, Scotland, UK.

Publication information

Pharm Stat. 2007 Jul-Sep;6(3):161-70. doi: 10.1002/pst.301.

DOI:10.1002/pst.301

PMID:17674404

Abstract

A common approach to analysing clinical trials with multiple outcomes is to control the probability for the trial as a whole of making at least one incorrect positive finding under any configuration of true and false null hypotheses. Popular approaches are to use Bonferroni corrections or structured approaches such as, for example, closed-test procedures. As is well known, such strategies, which control the family-wise error rate, typically reduce the type I error for some or all of the tests of the various null hypotheses to below the nominal level. In consequence, there is generally a loss of power for individual tests. What is less well appreciated, perhaps, is that depending on approach and circumstances, the test-wise loss of power does not necessarily lead to a family-wise loss of power. In fact, it may be possible to increase the overall power of a trial by carrying out tests on multiple outcomes without increasing the probability of making at least one type I error when all null hypotheses are true. We examine two types of problems to illustrate this. Unstructured testing problems arise typically (but not exclusively) when many outcomes are being measured. We consider the case of more than two hypotheses when a Bonferroni approach is being applied, while for illustration we assume compound symmetry to hold for the correlation of all variables. Using the device of a latent variable it is easy to show that power is not reduced as the number of variables tested increases, provided that the common correlation coefficient is not too high (say less than 0.75). Afterwards, we consider structured testing problems. Here, multiplicity problems arising from the comparison of more than two treatments, as opposed to more than one measurement, are typical. We conduct a numerical study and conclude again that power is not reduced as the number of tested variables increases.
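
The latent-variable argument for the unstructured (Bonferroni) case can be illustrated with a minimal sketch. It is not the authors' code, and it rests on assumptions the abstract does not state: each of the k test statistics is written as Z_i = delta + sqrt(rho) U + sqrt(1 - rho) E_i with U and E_i independent standard normal, every endpoint has the same standardised effect delta, the tests are one-sided, and each is run at level alpha / k. Conditional on the latent variable U the statistics are independent, so the probability of at least one rejection reduces to a one-dimensional integral over U. The function name `disjunctive_power` and the simple trapezoidal integration are choices made for this illustration only.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def disjunctive_power(k, rho, delta, alpha=0.025, grid=4001):
    """Probability that at least one of k one-sided Bonferroni tests rejects.

    Assumed model (illustrative only): Z_i = delta + sqrt(rho)*U + sqrt(1-rho)*E_i,
    i.e. compound symmetry with common correlation rho and equal effect delta.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")

    # Bonferroni critical value for each individual one-sided test.
    crit = STD_NORMAL.inv_cdf(1.0 - alpha / k)

    # Trapezoidal integration over the latent variable u on [-8, 8].
    lo, hi = -8.0, 8.0
    step = (hi - lo) / (grid - 1)
    total = 0.0
    for i in range(grid):
        u = lo + i * step
        # Given u, the k statistics are independent; this is the probability
        # that a single one of them fails to exceed the critical value.
        p_miss = STD_NORMAL.cdf((crit - delta - rho ** 0.5 * u) / (1.0 - rho) ** 0.5)
        weight = 0.5 if i in (0, grid - 1) else 1.0
        total += weight * STD_NORMAL.pdf(u) * p_miss ** k

    # total * step approximates P(no test rejects).
    return 1.0 - total * step


if __name__ == "__main__":
    # delta = 2.8 gives roughly 80% power for a single one-sided test at 0.025.
    for rho in (0.0, 0.5, 0.75, 0.9):
        row = [round(disjunctive_power(k, rho, 2.8), 3) for k in (1, 2, 5, 10)]
        print(f"rho={rho}: {row}")
```

Setting delta to 0 gives the probability of at least one type I error when all null hypotheses are true, which under these assumptions stays at or below alpha. How the power changes with k depends on rho and on the equal-effect assumption, so the printed table should be read as an illustration of the mechanism rather than a reproduction of the paper's results.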
