
P values in display items are ubiquitous and almost invariably significant: A survey of top science journals.

Affiliations

Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, California, United States of America.

Department of Clinical Psychology and Psychotherapy, Babes-Bolyai University, Cluj-Napoca, Romania.

Publication information

PLoS One. 2018 May 15;13(5):e0197440. doi: 10.1371/journal.pone.0197440. eCollection 2018.

DOI: 10.1371/journal.pone.0197440
PMID: 29763472
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5953482/
Abstract

P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often containing the main results, are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and, respectively, in 1997. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles relying on P values in 2017 (Nature 68%, Science 48%, PNAS 38%). In their absence, almost all reported P values were statistically significant (98%, 95% CI 96% to 99%). Conversely, when any multiplicity corrections were described, 88% (95% CI 82% to 93%) of reported P values were statistically significant. Use of Bayesian methods was scant (2.5%) and rarely (0.7%) articles relied exclusively on Bayesian statistics. Overall, wider appreciation of the need for multiplicity corrections is a welcome evolution, but the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome.
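The abstract's contrast between articles with and without multiplicity corrections can be illustrated with a minimal sketch. It assumes two common correction methods, Bonferroni and Benjamini-Hochberg; the abstract does not say which methods the surveyed articles used. The P values below are hypothetical and are not taken from the survey, and the function names are illustrative only.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag P values that stay significant under the Bonferroni threshold alpha / m."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag rejections under the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose P value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


def share_significant(flags):
    """Proportion of flags that are True."""
    return sum(flags) / len(flags)


# Hypothetical P values for illustration only (NOT data from the survey).
p_values = [0.001, 0.004, 0.012, 0.020, 0.031, 0.044, 0.049, 0.200]

uncorrected = [p < 0.05 for p in p_values]
print("uncorrected:       ", share_significant(uncorrected))
print("Bonferroni:        ", share_significant(bonferroni(p_values)))
print("Benjamini-Hochberg:", share_significant(benjamini_hochberg(p_values)))
```

On this made-up input, the share of significant results falls once either correction is applied. That matches the direction of the difference reported in the abstract (98% without corrections versus 88% with them), although the magnitudes here are arbitrary.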

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39dc/5953482/d38d901748e0/pone.0197440.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/39dc/5953482/525445ea5199/pone.0197440.g002.jpg

