Department of Psychology, Northwestern University.
Department of Psychology, University of California, Davis.
J Pers Soc Psychol. 2017 Aug;113(2):244-253. doi: 10.1037/pspi0000075.
Finkel, Eastwick, and Reis (2015; FER2015) argued that psychological science is better served by responding to apprehensions about replicability rates with contextualized solutions than with one-size-fits-all solutions. Here, we extend FER2015's analysis to suggest that much of the discussion of best research practices since 2011 has focused on a single feature of high-quality science (replicability), with insufficient sensitivity to the implications of recommended practices for other features, like discovery, internal validity, external validity, construct validity, consequentiality, and cumulativeness. Thus, although recommendations for bolstering replicability have been innovative, compelling, and abundant, it is difficult to evaluate their impact on our science as a whole, especially because many research practices that are beneficial for some features of scientific quality are harmful for others. For example, FER2015 argued that bigger samples are generally better, but also noted that very large samples ("those larger than required for effect sizes to stabilize"; p. 291) could have the downside of commandeering resources that would have been better invested in other studies. In their critique of FER2015, LeBel, Campbell, and Loving (2016) concluded, based on simulated data, that ever-larger samples are better for the efficiency of scientific discovery (i.e., that there are no tradeoffs). As demonstrated here, however, this conclusion holds only when the replicator's resources are considered in isolation. If we widen the assumptions to include the original researcher's resources as well, which is necessary if the goal is to consider resource investment for the field as a whole, the conclusion changes radically and strongly supports a tradeoff-based analysis. In general, as psychologists seek to strengthen our science, we must complement our much-needed work on increasing replicability with careful attention to the other features of a high-quality science.
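To make the resource-accounting point concrete, the following is a minimal toy model of our own construction in Python. It is not the simulation reported by FER2015 or by LeBel et al.; every parameter in it (a true effect size of d = 0.4, a 50% base rate of true hypotheses, and a replicate-every-significant-result policy) is an assumption chosen purely for illustration. Statistical power, the replicator-only view, rises monotonically with the per-group sample size n, while confirmed true discoveries per participant, the field-level view that also charges the original researcher's sample to the budget, peak at a moderate n and then decline.

# Toy model (illustrative assumptions only; not the published simulations).
import numpy as np
from scipy.stats import norm

def power_two_group(n, d, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with n per group."""
    z_crit = norm.ppf(1 - alpha / 2)
    ncp = d * np.sqrt(n / 2)           # noncentrality of the group difference
    return 1 - norm.cdf(z_crit - ncp)  # the opposite tail is negligible here

def yield_per_participant(n, d=0.4, prior_true=0.5, alpha=0.05):
    """Confirmed true effects per participant when the field pays for BOTH
    the original study and a same-n replication of each significant result."""
    pw = power_two_group(n, d, alpha)
    p_sig = prior_true * pw + (1 - prior_true) * alpha  # originals reaching significance
    cost_per_original = 2 * n * (1 + p_sig)             # original + triggered replication
    return prior_true * pw * pw / cost_per_original     # must be significant twice

for n in (25, 50, 100, 200, 400, 800):
    print(f"n per group = {n:4d}   "
          f"power = {power_two_group(n, 0.4):.2f}   "
          f"confirmed per 1,000 participants = {1000 * yield_per_participant(n):.3f}")

Under these assumed numbers, the field-level yield peaks around n of roughly 100 per group and falls thereafter, even as power continues to climb toward 1. The particular optimum is an artifact of the toy parameters, but any accounting that includes the original studies' costs produces some interior optimum, which is the tradeoff-based conclusion the abstract describes.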