Morris Nathan, Elston Robert
Case Western Reserve University, Cleveland, OH 44106-7281, USA.
Am Stat. 2011 Jan 1;65(3). doi: 10.1198/tast.2011.10117.
It is obvious that the power of a test statistic depends on the significance (alpha) level at which the test is performed. It is perhaps less obvious that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as the level α = 5 × 10⁻⁸, which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies that use alpha levels that will not be used in practice.
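The claim about degrees of freedom can be checked numerically. The following is a minimal sketch and not the authors' own computation. It assumes two chi-square tests, one with 1 degree of freedom and one with 2, that share the same noncentrality parameter, so the extra degree of freedom contributes no signal. It also assumes a target power of 80%. The function names (`power_1df`, `power_2df`, `required_ncp`, `ncp_ratio`) are hypothetical, chosen for illustration. Because the noncentrality parameter typically grows in proportion to sample size, the ratio of required noncentralities can be read as the relative sample-size cost of the extra degree of freedom.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_1df(ncp, alpha):
    """Power of a 1-df chi-square test with noncentrality parameter ncp."""
    # A 1-df chi-square statistic is a squared normal, so work on the z scale.
    z_crit = -_STD_NORMAL.inv_cdf(alpha / 2.0)
    shift = math.sqrt(ncp)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def power_2df(ncp, alpha):
    """Power of a 2-df chi-square test with noncentrality parameter ncp."""
    # Central 2-df survival function is exp(-c / 2), so c / 2 = -log(alpha).
    half_crit = -math.log(alpha)
    half_ncp = ncp / 2.0
    # A noncentral chi-square is a Poisson(ncp / 2) mixture of central
    # chi-squares with 2 + 2j df, whose survival functions are finite sums.
    weight = math.exp(-half_ncp)   # Poisson probability of j = 0
    term = math.exp(-half_crit)    # i = 0 term of the survival sum
    survival = term                # P(central chi-square, 2 df > c)
    total = 0.0
    j = 0
    while True:
        total += weight * survival
        j += 1
        weight *= half_ncp / j
        term *= half_crit / j
        survival += term           # now the survival for 2 + 2j df
        if j > half_ncp and weight < 1e-18:
            break
    return total


def required_ncp(power_fn, alpha, target=0.80):
    """Bisection for the noncentrality that gives the target power."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if power_fn(mid, alpha) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def ncp_ratio(alpha, target=0.80):
    """Required noncentrality of the 2-df test relative to the 1-df test."""
    return (required_ncp(power_2df, alpha, target)
            / required_ncp(power_1df, alpha, target))


if __name__ == "__main__":
    for alpha in (0.05, 5e-8):
        print(f"alpha = {alpha:g}: ncp ratio (2 df / 1 df) = {ncp_ratio(alpha):.3f}")
```

Under these assumptions the ratio is noticeably closer to 1 at α = 5 × 10⁻⁸ than at α = 0.05, which is consistent with the statement that the cost of an additional degree of freedom is comparatively low at very low alpha levels. Other targets, degrees of freedom, or alternatives would need their own check.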