Sexton S A, Ferguson N, Pearce C, Ricketts D M
Department of Orthopaedic Surgery, Princess Royal Hospital, Haywards Heath, West Sussex, UK.
Ann R Coll Surg Engl. 2008 Jan;90(1):58-61. doi: 10.1308/003588408X242312.
Many studies published in medical journals do not consider the statistical power required to detect a meaningful difference between study groups. As a result, these studies are often underpowered: the sample size may be too small to detect a difference (or other effect of interest) of a given size between the study groups as statistically significant. The conclusion that there is no statistically significant difference between groups therefore cannot be drawn unless the study has been shown to have sufficient power. The aim of this study was to establish the prevalence of negative studies with inadequate statistical power in British journals to which orthopaedic surgeons regularly submit.
We assessed all papers in the six consecutive issues published immediately before the start of the study (April 2005) of The Journal of Bone and Joint Surgery (British), Injury, and Annals of the Royal College of Surgeons of England. For each paper, we sought published evidence that a power analysis had been performed for its main hypothesis.
There were a total of 170 papers in which a statistical comparison of two or more groups was undertaken. Of these 170 papers, 49 (28.8%) stated as their primary conclusion that there was no statistically significant difference between the groups studied. Of these 49 papers, only 3 (6.1%) had performed a power analysis demonstrating adequate sample size.
These results demonstrate that most of the negative studies we examined in the British orthopaedic literature did not perform the statistical analysis necessary to support their stated conclusions. To remedy this, we recommend that the journals sampled include the following guidance in their instructions to authors: the statement 'no statistically significant difference was found between study groups' should be accompanied by the results of a power analysis.
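As an illustration of the kind of power analysis recommended above, the following is a minimal sketch, not the method used in any of the papers surveyed. It assumes a two-sided comparison of two equal-sized groups on a continuous outcome, a standardized effect size (Cohen's d), and the normal approximation to the test statistic. The function names `n_per_group` and `approx_power` are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardized difference (Cohen's d): the smallest
    difference worth detecting divided by the common standard deviation.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = _STD_NORMAL.inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


def approx_power(n, effect_size, alpha=0.05):
    """Approximate power achieved with n subjects per group.

    Assumption: the negligible chance of rejecting in the wrong
    direction is ignored.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(effect_size * math.sqrt(n / 2) - z_alpha)


if __name__ == "__main__":
    # Subjects needed per group to detect a medium effect (d = 0.5)
    # with 80% power at a two-sided 5% significance level.
    print(n_per_group(0.5))
    # Power of a study of the same effect with only 20 subjects per group.
    print(round(approx_power(20, 0.5), 2))
```

Under these assumptions, a study with 20 subjects per group has well under 80% power to detect a medium effect, so a non-significant result from such a study would not support a conclusion of no difference. An exact calculation based on the t distribution gives slightly larger sample sizes than this approximation.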