Johnson Valen E, Payne Richard D, Wang Tianying, Asher Alex, Mandal Soutrik
Department of Statistics, Texas A&M University, College Station, TX.
J Am Stat Assoc. 2017;112(517):1-10. doi: 10.1080/01621459.2016.1240079. Epub 2016 Oct 7.
Investigators from a large consortium of scientists recently performed a multi-year study in which they replicated 100 psychology experiments. Although statistically significant results were reported in 97% of the original studies, statistical significance was achieved in only 36% of the replicated studies. This article presents a reanalysis of these data based on a formal statistical model that accounts for publication bias by treating outcomes from unpublished studies as missing data, while simultaneously estimating the distribution of effect sizes for those studies that tested nonnull effects. The resulting model suggests that more than 90% of tests performed in eligible psychology experiments tested negligible effects, and that publication biases based on P-values caused the observed rates of nonreproducibility. The results of this reanalysis provide a compelling argument both for increasing the threshold required for declaring scientific discoveries and for adopting statistical summaries of evidence that account for the high proportion of tested hypotheses that are false. Supplementary materials for this article are available online.
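The mechanism the abstract describes can be illustrated with a toy simulation. This is not the authors' model (which is a formal missing-data analysis); it is a minimal sketch, with illustrative parameters, of how a significance-based publication filter combined with a high proportion of null hypotheses depresses replication rates well below the near-100% significance rate seen in the published literature.

```python
import math
import random

random.seed(0)

def simulate(n_studies=10000, prop_null=0.9, ncp=2.5, alpha_z=1.96):
    """Toy publication-bias sketch (illustrative, not the paper's model).

    Each study tests either a null effect (with probability prop_null) or a
    nonnull effect whose z-statistic has noncentrality `ncp`. Only studies
    that reach two-sided significance are 'published'; each published study
    is then replicated once under identical conditions.
    """
    published, replicated = 0, 0
    for _ in range(n_studies):
        mean = 0.0 if random.random() < prop_null else ncp
        z_original = random.gauss(mean, 1.0)
        if abs(z_original) > alpha_z:      # publication filter on significance
            published += 1
            z_replication = random.gauss(mean, 1.0)
            if abs(z_replication) > alpha_z:
                replicated += 1
    return published, replicated

pub, rep = simulate()
print(f"published: {pub}, replicated: {rep} ({rep / pub:.0%})")
```

Although every published study was significant by construction, only a minority of replications succeed: the publication filter admits many false positives from the large pool of null effects, and those rarely replicate. Raising the significance threshold, as the article argues, shrinks that pool of admitted false positives.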