Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3PT, UK.
Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK.
Stem Cell Reports. 2018 Oct 9;11(4):897-911. doi: 10.1016/j.stemcr.2018.08.013. Epub 2018 Sep 20.
Reproducibility in molecular and cellular studies is fundamental to scientific discovery. To establish the reproducibility of a well-defined long-term neuronal differentiation protocol, we repeated the cellular and molecular comparison of the same two iPSC lines across five distinct laboratories. Although variability within individual laboratories was acceptable, we detected poor cross-site reproducibility of the differential gene expression signature between these two lines. Factor analysis identifies the laboratory as the largest source of variation, along with several variation-inflating confounders such as passaging effects and progenitor storage. Single-cell transcriptomics shows that substantial cellular heterogeneity underlies the inter-laboratory variability and is responsible for biases in differential gene expression inference. Factor analysis-based normalization of the combined dataset can remove these nuisance technical effects, enabling robust hypothesis-generating studies. Our study shows that multi-center collaborations can expose systematic biases and identify critical factors to be standardized when publishing novel protocols, contributing to increased cross-site reproducibility.