Gøtzsche Peter C
Nordic Cochrane Centre, H:S Rigshospitalet, DK-2100 Copenhagen Ø, Denmark.
BMJ. 2006 Jul 29;333(7561):231-4. doi: 10.1136/bmj.38895.410451.79. Epub 2006 Jul 19.
To compare the distribution of P values in abstracts of randomised controlled trials with that in observational studies, and to check P values between 0.04 and 0.06.
Cross sectional study of all 260 abstracts in PubMed of articles published in 2003 that contained "relative risk" or "odds ratio" and reported results from a randomised trial, and random samples of 130 abstracts from cohort studies and 130 from case-control studies. P values were noted or calculated if unreported.
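The abstract does not say how unreported P values were calculated. One common approach, sketched here as an assumption rather than as the author's method, recovers an approximate two-sided P value from a relative risk or odds ratio and its 95% confidence interval by working on the log scale. The function name `p_from_ratio_ci` and the example numbers are hypothetical.

```python
import math

def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided P value for a ratio measure (RR or OR).

    Assumes the 95% CI is symmetric on the log scale (normal approximation).
    This is an illustrative method, not necessarily the one used in the study.
    """
    if not (0 < lower < upper) or estimate <= 0:
        raise ValueError("ratio and CI limits must be positive, with lower < upper")
    # Standard error of log(ratio), recovered from the width of the CI.
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    # z statistic for the null hypothesis ratio = 1.
    z = math.log(estimate) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical example: an odds ratio of 2.0 with 95% CI 1.0 to 4.0.
# The lower limit sits exactly on 1, so P comes out at about 0.05.
print(round(p_from_ratio_ci(2.0, 1.0, 4.0), 3))
```

A confidence limit that sits exactly on 1 corresponds to a P value of about 0.05, which is why rounding in a reported interval can move a borderline result across the significance threshold.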
Prevalence of significant P values in abstracts and distribution of P values between 0.04 and 0.06.
The first result in the abstract was statistically significant in 70% of the trials, 84% of cohort studies, and 84% of case-control studies. Although many of these results were derived from subgroup or secondary analyses, or from biased selection of results, they were presented without reservations in 98% of the trials. P values were more extreme in observational studies than in trials (P < 0.001), and more extreme in cohort studies than in case-control studies (P = 0.04). The distribution of P values around P = 0.05 was extremely skewed: only five trials had 0.05 ≤ P < 0.06, whereas 29 trials had 0.04 ≤ P < 0.05. I could check the calculations for 27 of these trials. One of the four results reported as non-significant was in fact significant. Of the 23 results reported as significant, four were wrong, five were doubtful, and four were open to discussion. Nine cohort studies and eight case-control studies reported P values between 0.04 and 0.06, but in all 17 cases P < 0.05. Because the analyses had been adjusted for confounders, these results could not be checked.
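To give a rough sense of how lopsided the split of 29 versus 5 trials is, the sketch below runs an exact two-sided binomial test under the simplifying assumption that a P value in the window from 0.04 to 0.06 is equally likely to fall on either side of 0.05. That assumption and the function name `binom_two_sided_half` are illustrative. This calculation is not reported in the abstract.

```python
from math import comb

def binom_two_sided_half(k, n):
    """Exact two-sided binomial P value for k of n outcomes when p = 0.5.

    With p = 0.5 the distribution is symmetric, so the two-sided value
    is twice the smaller tail (capped at 1).
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    tail = min(k, n - k)
    one_sided = sum(comb(n, i) for i in range(tail + 1)) / 2**n
    return min(1.0, 2 * one_sided)

# Counts from the abstract: 29 trials just below 0.05, 5 just above.
below, above = 29, 5
p = binom_two_sided_half(below, below + above)
print(f"two-sided P = {p:.2e}")
```

Under this equal-split assumption the result is far below 0.001, consistent with the description of the distribution as extremely skewed.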
Significant results in abstracts are common but should generally be disbelieved.