Pulver A E, Bartko J J, McGrath J A
University of Maryland School of Medicine, Maryland Psychiatric Research Center, Baltimore 21228.
Psychiatry Res. 1988 Mar;23(3):295-9. doi: 10.1016/0165-1781(88)90020-0.
Failure to consider statistical power when a study yields apparently "negative" results prevents accurate interpretation of those results. A nonsignificant result can arise when too few subjects are included to detect a true effect (low power), or when the number of subjects is adequate but no meaningful effect exists (high power, no effect); low power and no real effect can also occur together. Without considering power, one cannot distinguish a "negative" experiment from an inadequate one. This article examines 154 published nonsignificant t-test results. When power is calculated for an effect size equal to a standardized difference of unity (a mean difference of one standard deviation), over 50% of the tests have inadequate power.
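As a rough illustration of the kind of calculation the abstract describes, the sketch below estimates the power of a two-sided, two-sample t-test for a standardized difference of 1. It is not the authors' procedure: the function name `approx_power_two_sample`, the equal group sizes, the significance level of 0.05, and the sample sizes in the loop are all assumptions made for the example, and the normal approximation is used in place of the exact noncentral t distribution.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test.

    Assumes equal group sizes and uses the normal approximation to the
    noncentral t distribution, so it somewhat overstates power when the
    groups are very small.

    d            -- standardized difference (mean difference / SD)
    n_per_group  -- number of subjects in each group (hypothetical input)
    alpha        -- two-sided significance level (assumed 0.05 here)
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = d * sqrt(n_per_group / 2)   # expected value of the test statistic
    # Probability of rejecting in either tail when the true effect is d.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Illustrative sample sizes only; these are not taken from the article.
for n in (5, 10, 17, 25):
    print(n, round(approx_power_two_sample(1.0, n), 2))
```

Under this approximation, power for a standardized difference of 1 rises steadily with group size, which is why a nonsignificant result from a small study says little about whether an effect of that size exists. Any cutoff for "adequate" power (0.80 is a common convention) is likewise an assumption here, since the abstract does not state the threshold used.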