Hauck W W, Anderson S
Stat Med. 1986 May-Jun;5(3):203-9. doi: 10.1002/sim.4780050302.
An issue of continuing interest is the interpretation and reporting of 'negative' studies, namely studies that do not find statistically significant differences. The most common approach is the design-power method which determines, irrespective of the observed difference, what differences the study could have been expected to detect. We propose an alternative approach, the application of equivalence testing methods, where we define equivalence to mean that the actual difference lies within some specified limits. This approach, in contrast to the design-power approach, provides a way of quantifying (with p-values) what was actually determined from the study instead of saying what the study may or may not have accomplished with some degree of certainty (power). For example, a possible outcome of the equivalence testing approach is the conclusion at the 5 per cent level that two means (or proportions) do not differ by more than some specified amount. The equivalence testing approach applies to any study design. We illustrate the method with a cancer clinical trial and an epidemiologic case-control study. In addition, for those studies in which one cannot specify limits a priori, we propose the use of equivalence curves to summarize and present the study results.
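As a rough illustration of the idea, the sketch below computes an equivalence p-value and an equivalence curve for an observed difference. It uses the two one-sided tests (TOST) procedure with a normal approximation, which is one common equivalence test and not necessarily the procedure the authors use. The function names, the inputs `diff`, `se` and `delta`, and the numbers in the usage lines are all hypothetical.

```python
from statistics import NormalDist


def tost_p_value(diff, se, delta):
    """P-value for H0: |true difference| >= delta versus H1: |true difference| < delta.

    Assumption: the estimate `diff` is approximately normal with standard
    error `se`. This is a generic TOST sketch, not the paper's own test.
    """
    if se <= 0 or delta <= 0:
        raise ValueError("se and delta must be positive")
    z = NormalDist()
    # One-sided test of H0: true difference <= -delta
    p_lower = 1.0 - z.cdf((diff + delta) / se)
    # One-sided test of H0: true difference >= +delta
    p_upper = z.cdf((diff - delta) / se)
    # Equivalence is claimed only if both one-sided tests reject,
    # so the overall p-value is the larger of the two.
    return max(p_lower, p_upper)


def equivalence_curve(diff, se, deltas):
    """Return (delta, p-value) pairs over a range of candidate limits.

    Intended for the case where no limit can be specified a priori.
    """
    return [(d, tost_p_value(diff, se, d)) for d in deltas]


if __name__ == "__main__":
    # Hypothetical data: observed difference in proportions and its standard error.
    observed_diff, std_err = 0.02, 0.03
    for delta, p in equivalence_curve(observed_diff, std_err, [0.05, 0.10, 0.15]):
        print(f"delta = {delta:.2f}  p = {p:.4f}")
```

Read this way, a p-value below 0.05 at a given `delta` supports the conclusion, at the 5 per cent level, that the two groups differ by no more than `delta`. The curve lets a reader see how that conclusion changes as the limit is widened or narrowed.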