Accumulating evidence across studies: Consistent methods protect against false findings produced by p-hacking.

Affiliations

Department of Psychology, Ohio State University, Columbus, Ohio, United States of America.

Department of Psychology, Queen's University, Kingston, Ontario, Canada.

Publication information

PLoS One. 2024 Aug 29;19(8):e0307999. doi: 10.1371/journal.pone.0307999. eCollection 2024.

Abstract

Much empirical science involves evaluating alternative explanations for the obtained data. For example, given certain assumptions underlying a statistical test, a "significant" result generally refers to implausibility of a null (zero) effect in the population producing the obtained study data. However, methodological work on various versions of p-hacking (i.e., using different analysis strategies until a "significant" result is produced) questions whether significant p-values might often reflect false findings. Indeed, initial simulations of single studies showed that the potential for finding "significant" but false findings might be much higher than the nominal .05 value when various analysis flexibilities are undertaken. In many settings, however, research articles report multiple studies using consistent methods across the studies, where those consistent methods would constrain the flexibilities used to produce high false-finding rates for simulations of single studies. Thus, we conducted simulations of study sets. These simulations show that consistent methods across studies (i.e., consistent in terms of which measures are analyzed, which conditions are included, and whether and how covariates are included) dramatically reduce the potential for flexible research practices (p-hacking) to produce consistent sets of significant results across studies. For p-hacking to produce even modest probabilities of a consistent set of studies would require (a) a large amount of selectivity in study reporting and (b) severe (and quite intentional) versions of p-hacking. With no more than modest selective reporting and with consistent methods across studies, p-hacking does not provide a plausible explanation for consistent empirical results across studies, especially as the size of the reported study set increases. In addition, the simulations show that p-hacking can produce high rates of false findings for single studies with very large samples. In contrast, a series of methodologically consistent studies (even with much smaller samples) is much less vulnerable to the forms of p-hacking examined in the simulations.
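The logic of the abstract can be illustrated with a toy Monte Carlo sketch. This is not the authors' simulation: the design (two groups, three independent outcome measures, a z-test with a known SD of 1, sets of three studies) and every function name and parameter below are assumptions chosen only for illustration. Under a true null effect, a researcher who may report whichever measure "works" gets a significant result in a single study with probability 1 - 0.95^3, roughly 0.143. Requiring the same measure to be significant in every study of the set shrinks that probability sharply.

```python
import math
import random


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a mean difference, assuming known SD = 1."""
    n_a, n_b = len(group_a), len(group_b)
    diff = sum(group_a) / n_a - sum(group_b) / n_b
    z = diff / math.sqrt(1 / n_a + 1 / n_b)
    # For a standard normal z, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def significant_measures(n_per_group, n_measures, rng, alpha=0.05):
    """Simulate one study with no true effect; return the measures with p < alpha."""
    hits = set()
    for measure in range(n_measures):
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(a, b) < alpha:
            hits.add(measure)
    return hits


def simulate(n_sets=4000, n_studies=3, n_measures=3, n_per_group=20, seed=1):
    """Estimate false-finding rates for single studies and for study sets."""
    rng = random.Random(seed)
    single = flexible = consistent = 0
    for _ in range(n_sets):
        studies = [
            significant_measures(n_per_group, n_measures, rng)
            for _ in range(n_studies)
        ]
        # Single study: any measure may be reported as the "finding".
        single += bool(studies[0])
        # Flexible set: each study may report a different significant measure.
        flexible += all(studies)
        # Consistent set: the same measure must be significant in every study
        # (direction of the effect is ignored here, which is lenient).
        consistent += bool(set.intersection(*studies))
    return {
        "single": single / n_sets,
        "flexible": flexible / n_sets,
        "consistent": consistent / n_sets,
    }


if __name__ == "__main__":
    for label, rate in simulate().items():
        print(f"{label}: {rate:.4f}")
```

With these assumed settings, the theoretical rates are about 0.143 for a single study, about 0.0029 (0.143^3) for a set in which the reported measure may change from study to study, and about 0.0004 for a set that must use the same measure throughout. The simulated estimates should fall near these values, although the rarest outcome may not appear at all in a run of this size.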


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/60e1/11361653/001d2bc2e097/pone.0307999.g001.jpg
