Souto-Maior Caetano
Basque Center for Applied Mathematics, Bilbao, Spain.
Laboratory of Systems Genetics, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States.
PeerJ. 2025 Apr 29;13:e18972. doi: 10.7717/peerj.18972. eCollection 2025.
Reports of crises of reproducibility have abounded in the scientific and popular press, and are often attributed to questionable research practices, lack of rigor in protocols, or fraud. On the other hand, it is well known that, just like observations within a single biological experiment, outcomes of biological replicates will vary; nevertheless, that variability is rarely assessed formally. Here I argue that some instances of failure to replicate experiments are in fact failures to properly describe the structure of variance. I formalize a hierarchy of distributions that represent the system-level and experiment-level effects, and correctly account for the between- and within-experiment variances, respectively. I also show that this formulation is straightforward to implement and generalize through Bayesian hierarchical models, although it does not preclude the use of frequentist models. One of the main results of this approach is that a set of repetitions of an experiment, instead of being described by an irreconcilable string of significant/nonsignificant results, is described and consolidated as a system-level distribution. As a corollary, stronger statements about a system can only be made by analyzing a number of replicates, so I argue that scientists should refrain from making them based on individual experiments.
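The two levels of variance described in the abstract can be made concrete with a small simulation. What follows is a minimal sketch under stated assumptions, not the paper's implementation: it assumes normal distributions at both levels, a balanced design (the same number of observations per experiment), and a simple method-of-moments estimate of the between-experiment variance in place of a full Bayesian hierarchical fit. The function names (`simulate_replicates`, `summarize`), the parameter names (`mu`, `tau`, `sigma`) and the 1.96 normal-approximation cutoff are all illustrative choices.

```python
import math
import random
import statistics


def simulate_replicates(mu, tau, sigma, n_experiments, n_obs, seed=0):
    """Simulate replicate experiments from a two-level normal hierarchy.

    Assumed (hypothetical) model:
      experiment-level effect  theta_j ~ Normal(mu, tau)
      observation              y_ij    ~ Normal(theta_j, sigma)
    """
    rng = random.Random(seed)
    experiments = []
    for _ in range(n_experiments):
        theta = rng.gauss(mu, tau)  # effect realized in this replicate
        experiments.append([rng.gauss(theta, sigma) for _ in range(n_obs)])
    return experiments


def summarize(experiments):
    """Summarize replicates at the experiment level and at the system level."""
    n_obs = len(experiments[0])  # balanced design assumed
    means = [statistics.fmean(obs) for obs in experiments]
    within_var = statistics.fmean([statistics.variance(obs) for obs in experiments])

    # Method of moments: Var(experiment means) = tau^2 + sigma^2 / n_obs,
    # so subtract the sampling part and truncate at zero.
    between_var = max(0.0, statistics.variance(means) - within_var / n_obs)

    # Per-experiment test of "effect differs from zero" (normal approximation).
    n_significant = 0
    for obs in experiments:
        std_err = statistics.stdev(obs) / math.sqrt(n_obs)
        if abs(statistics.fmean(obs)) / std_err > 1.96:
            n_significant += 1

    return {
        "experiment_means": means,
        "system_mean": statistics.fmean(means),
        "within_var": within_var,
        "between_var": between_var,
        "n_significant": n_significant,
    }


if __name__ == "__main__":
    replicates = simulate_replicates(
        mu=0.5, tau=0.5, sigma=1.0, n_experiments=8, n_obs=10, seed=1
    )
    summary = summarize(replicates)
    print("significant replicates:", summary["n_significant"], "of", len(replicates))
    print("system-level mean:", round(summary["system_mean"], 3))
    print("between-experiment variance:", round(summary["between_var"], 3))
    print("within-experiment variance:", round(summary["within_var"], 3))
```

When the between-experiment standard deviation `tau` is comparable to the system-level effect `mu`, individual replicates can disagree on significance even though they are all drawn from the same system. The system-level summary (mean and between-experiment variance) describes the set of replicates as a whole rather than treating each outcome as a separate verdict.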