Accumulating evidence across studies: Consistent methods protect against false findings produced by p-hacking.

Affiliations

Department of Psychology, Ohio State University, Columbus, Ohio, United States of America.

Department of Psychology, Queen's University, Kingston, Ontario, Canada.

Publication information

PLoS One. 2024 Aug 29;19(8):e0307999. doi: 10.1371/journal.pone.0307999. eCollection 2024.

DOI:10.1371/journal.pone.0307999

PMID:39208346

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11361653/

Abstract

Much empirical science involves evaluating alternative explanations for the obtained data. For example, given certain assumptions underlying a statistical test, a "significant" result generally refers to implausibility of a null (zero) effect in the population producing the obtained study data. However, methodological work on various versions of p-hacking (i.e., using different analysis strategies until a "significant" result is produced) questions whether significant p-values might often reflect false findings. Indeed, initial simulations of single studies showed that the potential for finding "significant" but false findings might be much higher than the nominal .05 value when various analysis flexibilities are undertaken. In many settings, however, research articles report multiple studies using consistent methods across the studies, where those consistent methods would constrain the flexibilities used to produce high false-finding rates for simulations of single studies. Thus, we conducted simulations of study sets. These simulations show that consistent methods across studies (i.e., consistent in terms of which measures are analyzed, which conditions are included, and whether and how covariates are included) dramatically reduce the potential for flexible research practices (p-hacking) to produce consistent sets of significant results across studies. For p-hacking to produce even modest probabilities of a consistent set of studies would require (a) a large amount of selectivity in study reporting and (b) severe (and quite intentional) versions of p-hacking. With no more than modest selective reporting and with consistent methods across studies, p-hacking does not provide a plausible explanation for consistent empirical results across studies, especially as the size of the reported study set increases. In addition, the simulations show that p-hacking can produce high rates of false findings for single studies with very large samples. 
In contrast, a series of methodologically-consistent studies (even with much smaller samples) is much less vulnerable to the forms of p-hacking examined in the simulations.
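The contrast the abstract draws between flexible and consistent analyses can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code. It assumes only one form of p-hacking (choosing which of several outcome measures to report), independent measures, a z-test with known unit variance, and no selective reporting of studies. The function names and defaults are hypothetical.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
STD_NORMAL = NormalDist()  # standard normal, used for two-sided p-values


def null_study_p_values(n_per_group, n_measures, rng):
    """Simulate one two-group study with no true effect; return one p-value per measure."""
    p_values = []
    for _ in range(n_measures):
        treatment = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        # z-test with known unit variance: SE of the mean difference is sqrt(2 / n).
        z = (fmean(treatment) - fmean(control)) / (2.0 / n_per_group) ** 0.5
        p_values.append(2.0 * (1.0 - STD_NORMAL.cdf(abs(z))))
    return p_values


def false_finding_rates(n_studies, n_measures, n_per_group=20, n_sims=2000, seed=1):
    """Estimate how often an all-significant study set arises under the null.

    Returns (flexible_rate, consistent_rate):
      flexible   - every study has at least one significant measure (any measure).
      consistent - one and the same measure is significant in every study.
    """
    rng = random.Random(seed)
    flexible_hits = 0
    consistent_hits = 0
    for _ in range(n_sims):
        studies = [null_study_p_values(n_per_group, n_measures, rng)
                   for _ in range(n_studies)]
        if all(min(p_values) < ALPHA for p_values in studies):
            flexible_hits += 1
        if any(all(study[m] < ALPHA for study in studies)
               for m in range(n_measures)):
            consistent_hits += 1
    return flexible_hits / n_sims, consistent_hits / n_sims


if __name__ == "__main__":
    for n_studies in (1, 2, 3):
        flexible, consistent = false_finding_rates(n_studies, n_measures=3)
        print(f"{n_studies} studies: flexible={flexible:.3f}, consistent={consistent:.3f}")
```

For a single study the two rates coincide, because any significant measure counts. As the set grows, the consistent rate can never exceed the flexible one, since a measure that is significant in every study also gives each study a significant measure. Effect direction is ignored here, so requiring the same sign across studies would lower the consistent rate further.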

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60e1/11361653/001d2bc2e097/pone.0307999.g001.jpg

Similar articles

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Is There Evidence of P-Hacking in Imaging Research?

Can Assoc Radiol J. 2023 Aug;74(3):497-507. doi: 10.1177/08465371221139418. Epub 2022 Nov 22.

p-Hacking and publication bias interact to distort meta-analytic effect size estimates.

Psychol Methods. 2020 Aug;25(4):456-471. doi: 10.1037/met0000246. Epub 2019 Dec 2.

Impact of redefining statistical significance on P-hacking and false positive rates: An agent-based model.

PLoS One. 2024 May 16;19(5):e0303262. doi: 10.1371/journal.pone.0303262. eCollection 2024.

Big little lies: a compendium and simulation of p-hacking strategies.

R Soc Open Sci. 2023 Feb 8;10(2):220346. doi: 10.1098/rsos.220346. eCollection 2023 Feb.

Tempest in a teacup: An analysis of p-Hacking in organizational research.

PLoS One. 2023 Feb 24;18(2):e0281938. doi: 10.1371/journal.pone.0281938. eCollection 2023.

Response to letter to the editor from Dr Rahman Shiri: The challenging topic of suicide across occupational groups.

Scand J Work Environ Health. 2018 Jan 1;44(1):108-110. doi: 10.5271/sjweh.3698. Epub 2017 Dec 8.

Is N-Hacking Ever OK? The consequences of collecting more data in pursuit of statistical significance.

PLoS Biol. 2023 Nov 1;21(11):e3002345. doi: 10.1371/journal.pbio.3002345. eCollection 2023 Nov.
