Department of Psychology, University of California San Diego, La Jolla, CA 92093
Proc Natl Acad Sci U S A. 2020 Mar 17;117(11):5559-5567. doi: 10.1073/pnas.1914237117. Epub 2020 Mar 3.
The perceived replication crisis and the reforms designed to address it are grounded in the notion that science is a binary signal detection problem. However, contrary to null hypothesis significance testing (NHST) logic, the magnitude of the underlying effect size for a given experiment is best conceptualized as a random draw from a continuous distribution, not as a random draw from a dichotomous distribution (null vs. alternative). Moreover, because continuously distributed effects selected using a P < 0.05 filter must be inflated, the fact that they are smaller when replicated (reflecting regression to the mean) is no reason to sound the alarm. Considered from this perspective, recent replication efforts suggest that most published P < 0.05 scientific findings are "true" (i.e., in the correct direction), with observed effect sizes that are inflated to varying degrees. We propose that original science is a screening process, one that adopts NHST logic as a useful fiction for selecting true effects that are potentially large enough to be of interest to other scientists. Unlike original science, replication science seeks to precisely measure the underlying effect size associated with an experimental protocol via large-N direct replication, without regard for statistical significance. Registered reports are well suited to (often resource-intensive) direct replications, which should focus on influential findings and be published regardless of outcome. Conceptual replications play an important but separate role in validating theories. However, because they are part of NHST-based original science, conceptual replications cannot serve as the field's self-correction mechanism. Only direct replications can do that.
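The inflation argument in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the function name `simulate`, the normal distribution of true effects, and every parameter value (`tau`, `se_orig`, `se_rep`, the 1.96 cutoff) are assumptions chosen only for illustration. It draws true effect sizes from a continuous distribution, keeps the original studies that pass a two-sided significance filter, and then compares the selected original estimates with precise replications of the same effects.

```python
import random
import statistics


def simulate(n_studies=20000, tau=0.3, se_orig=0.2, se_rep=0.05,
             z_crit=1.96, seed=0):
    """Toy model of a significance filter applied to continuous effects.

    All parameter values are illustrative assumptions, not estimates
    from the paper.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_studies):
        true_d = rng.gauss(0.0, tau)        # true effect, continuous distribution
        obs = rng.gauss(true_d, se_orig)    # noisy original estimate
        if abs(obs) / se_orig > z_crit:     # two-sided filter, roughly P < 0.05
            rep = rng.gauss(true_d, se_rep) # precise (large-N) replication
            selected.append((true_d, obs, rep))

    # Express every quantity in the direction of the original finding,
    # so a positive value means "same sign as the original estimate".
    def aligned(value, obs):
        return value if obs > 0 else -value

    return {
        "n_selected": len(selected),
        "mean_original": statistics.mean(abs(o) for _, o, _ in selected),
        "mean_true": statistics.mean(aligned(t, o) for t, o, _ in selected),
        "mean_replication": statistics.mean(aligned(r, o) for _, o, r in selected),
        "share_correct_direction": statistics.mean(
            1.0 if t * o > 0 else 0.0 for t, o, _ in selected
        ),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed settings, the selected original estimates are on average larger than their replications, while most selected effects still have the same sign as the original finding. This is the regression-to-the-mean pattern the abstract describes. The size of the gap depends entirely on the assumed spread of true effects and the assumed standard errors.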