Killeen Peter R
Arizona State University, 405 Marcus Drive, Prescott, AZ 86303 USA.
Perspect Behav Sci. 2018 Sep 5;42(1):109-132. doi: 10.1007/s40614-018-0171-8. eCollection 2019 Mar.
Scientists abstract hypotheses from observations of the world, which they then deploy to test their reliability. The best way to test reliability is to predict an effect before it occurs. If we can manipulate the independent variables (the efficient causes) that make it occur, then ability to predict makes it possible to control. Such control helps to isolate the relevant variables. Control also refers to a comparison condition, conducted to see what would have happened if we had not deployed the key ingredient of the hypothesis: scientific knowledge only accrues when we compare what happens in one condition against what happens in another. When the results of such comparisons are not definitive, metrics of the degree of efficacy of the manipulation are required. Many of those derive from statistical inference, and many of those poorly serve the purpose of the cumulation of knowledge. Without ability to replicate an effect, the utility of the principle used to predict or control is dubious. Traditional models of statistical inference are weak guides to replicability and utility of results. Several alternatives to null hypothesis testing are sketched: Bayesian, model comparison, and predictive inference (p_rep). Predictive inference shows, for example, that the failure to replicate most results in the Open Science Project was predictable. Replicability is but one aspect of scientific understanding: it establishes the reliability of our data and the predictive ability of our formal models. It is a necessary aspect of scientific progress, even if not by itself sufficient for understanding.