Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, Queensland, Australia
Arthritis and Clinical Immunology Research Program, Division of Genomics and Data Sciences, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, USA.
BMJ Open. 2019 Nov 21;9(11):e032506. doi: 10.1136/bmjopen-2019-032506.
Previous research has shown clear biases in the distribution of published p values, with an excess below the 0.05 threshold due to a combination of p-hacking and publication bias. We aimed to examine the bias for statistical significance using published confidence intervals.
Observational study.
Papers published in Medline since 1976.
Over 968 000 confidence intervals extracted from abstracts and over 350 000 intervals extracted from full texts.
Cumulative distributions of lower and upper confidence interval limits for ratio estimates.
We found an excess of statistically significant results, with a glut of lower interval limits just above 1 and upper interval limits just below 1. These excesses have not improved in recent years. The excesses did not appear in a set of over 100 000 confidence intervals that were not subject to p-hacking or publication bias.
The huge excesses of published confidence intervals that are just below the threshold for statistical significance are not statistically plausible. Large improvements in research practice are needed to provide more results that better reflect the truth.
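As a rough illustration of the outcome measure described above, the following minimal sketch builds an empirical cumulative distribution of lower confidence interval limits for ratio estimates and measures how much of it sits just above 1. It is not the authors' code: the function names (`is_significant`, `ecdf`), the 0.05-wide window and the example intervals are all assumptions made up for illustration, and none of the values come from the study.

```python
from bisect import bisect_right


def is_significant(lower, upper, null=1.0):
    """A ratio interval counts as 'significant' when it excludes the null value of 1."""
    return lower > null or upper < null


def ecdf(values):
    """Return F, where F(x) is the fraction of values less than or equal to x."""
    ordered = sorted(values)
    n = len(ordered)

    def F(x):
        return bisect_right(ordered, x) / n

    return F


# Made-up (lower, upper) limits for illustration only, NOT data from the study.
intervals = [(0.80, 1.20), (1.01, 1.50), (1.02, 2.10), (0.60, 0.99), (0.90, 1.10)]

n_significant = sum(is_significant(lo, hi) for lo, hi in intervals)

# Cumulative distribution of the lower limits.
lower_cdf = ecdf([lo for lo, _ in intervals])

# Share of lower limits in the window (1.00, 1.05]; a sharp jump in the
# cumulative distribution here is the kind of excess the abstract describes.
share_just_above_one = lower_cdf(1.05) - lower_cdf(1.00)

print(n_significant, round(share_just_above_one, 2))
```

The same `ecdf` helper could be applied to the upper limits to look for a matching jump just below 1. With real data the window width would be a judgement call, since too wide a window dilutes a narrow excess.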