Impact of redefining statistical significance on P-hacking and false positive rates: An agent-based model.

Affiliations

Department of Mathematics, Loyola Marymount University, Los Angeles, California, United States of America.

Tempest Technologies, Los Angeles, California, United States of America.

Publication information

PLoS One. 2024 May 16;19(5):e0303262. doi: 10.1371/journal.pone.0303262. eCollection 2024.

DOI:10.1371/journal.pone.0303262

PMID:38753677

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11098386/

Abstract

In recent years, concern has grown about the inappropriate application and interpretation of P values, especially the use of P<0.05 to denote "statistical significance" and the practice of P-hacking to produce results below this threshold and then selectively reporting those results in publications. Such behavior is said to be a major contributor to the large number of false and non-reproducible discoveries found in academic journals. In response, it has been proposed that the threshold for statistical significance be changed from 0.05 to 0.005. The aim of the current study was to explore the impact of a 0.005 P value threshold on P-hacking and published false positive rates, using an evolutionary agent-based model composed of researchers who test hypotheses and strive to increase their publication rates. Three scenarios were examined: one in which researchers tested a single hypothesis, one in which they tested multiple hypotheses using a P<0.05 threshold, and one in which they tested multiple hypotheses using a P<0.005 threshold. Effect sizes were varied across models, and output was assessed in terms of researcher effort, the number of hypotheses tested, the number of publications, and the published false positive rate. The results supported the view that a more stringent P value threshold can reduce the rate of published false positive results. Researchers still engaged in P-hacking under the new threshold, but the effort they expended increased substantially and their overall productivity fell, resulting in a decline in the published false positive rate. Compared to other proposed interventions to improve the academic publishing system, changing the P value threshold has the advantage of being relatively easy to implement, and it could be monitored and enforced with minimal effort by journal editors and peer reviewers.
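The trade-off described in the abstract can be illustrated with a short closed-form calculation. The sketch below is not the authors' evolutionary agent-based model. It assumes a much simpler setting in which every hypothesis is tested once with a two-sided one-sample z-test, only significant results are published, and a fixed share of tested hypotheses is true. The function names and the default values (`effect_size=0.5`, `n=30`, `prior_true=0.1`) are hypothetical choices made for illustration only.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def power_two_sided_z(alpha, effect_size, n):
    """Probability that a two-sided one-sample z-test rejects a false null.

    Assumes a standardized effect `effect_size` and sample size `n`.
    Rejections in the wrong direction are counted too; they are negligible
    for the values used here.
    """
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size * n ** 0.5
    return (1.0 - _STD_NORMAL.cdf(z_crit - shift)) + _STD_NORMAL.cdf(-z_crit - shift)


def published_false_positive_rate(alpha, power, prior_true):
    """Share of significant (published) results that come from true nulls."""
    false_pos = alpha * (1.0 - prior_true)
    true_pos = power * prior_true
    return false_pos / (false_pos + true_pos)


def expected_tests_per_publication(alpha, power, prior_true):
    """Mean number of independent tests needed to obtain one significant result."""
    p_significant = alpha * (1.0 - prior_true) + power * prior_true
    return 1.0 / p_significant


def summarize(alpha, effect_size=0.5, n=30, prior_true=0.1):
    """Collect the three quantities for one threshold (hypothetical defaults)."""
    power = power_two_sided_z(alpha, effect_size, n)
    return {
        "alpha": alpha,
        "power": power,
        "false_positive_rate": published_false_positive_rate(alpha, power, prior_true),
        "tests_per_publication": expected_tests_per_publication(alpha, power, prior_true),
    }


if __name__ == "__main__":
    for threshold in (0.05, 0.005):
        row = summarize(threshold)
        print(
            f"alpha={row['alpha']}: power={row['power']:.3f}, "
            f"published FPR={row['false_positive_rate']:.3f}, "
            f"tests per publication={row['tests_per_publication']:.1f}"
        )
```

Under these assumptions, the stricter threshold lowers the share of published results that are false positives and raises the number of tests needed per publication. This matches the direction of the effect reported in the abstract, although the magnitudes depend entirely on the assumed effect size, sample size, and prior.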

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e505/11098386/86fec0d22fb5/pone.0303262.g002.jpg
