Department of Physiology and Program in Neuroscience, University of Maryland School of Medicine, Baltimore, MD, 21201
eNeuro. 2020 Jul 23;7(4). doi: 10.1523/ENEURO.0357-19.2020. Print 2020 Jul/Aug.
Science needs to understand the strength of its findings. This essay considers the evaluation of studies that test scientific (not statistical) hypotheses. A scientific hypothesis is a putative explanation for an observation or phenomenon; it makes (or "entails") testable predictions that must be true if the hypothesis is true and that lead to its rejection if they are false. The question is, "How should we judge the strength of a hypothesis that passes a series of experimental tests?" This question is especially relevant in view of the "reproducibility crisis" that is the cause of great unease. Reproducibility is said to be a dire problem because major neuroscience conclusions supposedly rest entirely on the outcomes of single, p-valued statistical tests. To investigate this concern, I propose to (1) ask whether neuroscience typically does base major conclusions on single tests; (2) discuss the advantages of testing multiple predictions to evaluate a hypothesis; and (3) review ways in which multiple outcomes can be combined to assess the overall strength of a project that tests multiple predictions of one hypothesis. I argue that scientific hypothesis testing in general, and combining the results of several experiments in particular, may justify placing greater confidence in multiple-testing procedures than in other ways of conducting science.
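To make point (3) concrete, the sketch below shows one standard way of combining the outcomes of several tests of a single hypothesis: Fisher's combined probability test. The abstract itself does not name a method, so the choice of Fisher's method is an assumption made here for illustration, as are the function name `fisher_combined_p` and the example p-values. The method also assumes that the individual tests are statistically independent. For k tests with p-values p_1, ..., p_k, the statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the joint null hypothesis, and for even degrees of freedom its upper tail has a closed form, so only the standard library is needed.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Illustrative sketch only: assumes the tests are independent and that
    every p-value lies in (0, 1]. Returns the combined p-value.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we work with X / 2.
    half_stat = -sum(math.log(p) for p in p_values)

    # Upper tail of a chi-square distribution with 2k degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    term = 1.0
    total = 1.0
    for j in range(1, k):
        term *= half_stat / j
        total += term
    return math.exp(-half_stat) * total


# Hypothetical example: two independent tests of predictions of the same
# hypothesis, each with p = 0.05.
combined = fisher_combined_p([0.05, 0.05])
print(f"combined p-value: {combined:.4f}")
```

With these hypothetical inputs the combined p-value is smaller than either individual p-value, which illustrates, under the independence assumption, how agreement among several tests can support a conclusion more strongly than any single test does.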