

Challenging the N-Heuristic: Effect size, not sample size, predicts the replicability of psychological science.

Author Affiliations

Graduate School of Education, Stanford University, Stanford, California, United States of America.

School of Humanities and Social Science, The Chinese University of Hong Kong, Shenzhen, Shenzhen, China.

Publication Information

PLoS One. 2024 Aug 23;19(8):e0306911. doi: 10.1371/journal.pone.0306911. eCollection 2024.

Abstract

Large sample size (N) is seen as a key criterion in judging the replicability of psychological research, a phenomenon we refer to as the N-Heuristic. This heuristic has led to the incentivization of fast, online, non-behavioral studies, to the potential detriment of psychological science. While large N should in principle increase statistical power and thus the replicability of effects, in practice it may not. Large-N studies may have other attributes that undercut their power or validity. Consolidating data from all systematic, large-scale attempts at replication (N = 307 original-replication study pairs), we find that the original study's sample size did not predict its likelihood of being replicated (rs = -0.02, p = 0.741), even with study design and research area controlled. By contrast, effect size emerged as a substantial predictor (rs = 0.21, p < 0.001), which held regardless of the study's sample size. N may be a poor predictor of replicability because studies with larger N investigated smaller effects (rs = -0.49, p < 0.001). Contrary to these results, a survey of 215 professional psychologists, presenting them with a comprehensive list of methodological criteria, found sample size to be rated as the most important criterion in judging a study's replicability. Our findings strike a cautionary note with respect to the prioritization of large N in judging the replicability of psychological science.
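The abstract's core point, that a large N raises statistical power only if the effect under study is not correspondingly small, can be illustrated with a standard normal-approximation power calculation for a two-sided, two-sample t-test. This is a hedged sketch for intuition only; the paper's own analyses are Spearman correlations across 307 study pairs, not this formula, and the effect sizes and group sizes below are illustrative, not drawn from the dataset.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def power_two_sample(d, n_per_group, alpha_z=1.959963985):
    """Approximate power of a two-sided two-sample t-test at alpha = 0.05,
    using the normal approximation: Phi(d * sqrt(n/2) - z_{1 - alpha/2}),
    where d is Cohen's d and n is the per-group sample size."""
    return normal_cdf(d * math.sqrt(n_per_group / 2.0) - alpha_z)

# A modest study of a large effect can outpower a much larger study
# of a tiny effect, which is why N alone is a weak replicability cue.
print(round(power_two_sample(d=0.8, n_per_group=30), 2))   # large effect, small N
print(round(power_two_sample(d=0.1, n_per_group=300), 2))  # small effect, large N
```

Under these illustrative numbers, the small study of the large effect has substantially higher power than the study with ten times the sample size, mirroring the paper's observation that larger-N studies tended to investigate smaller effects (rs = -0.49).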


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/809c/11343368/1dff1c872313/pone.0306911.g001.jpg
