Gilad Yoav, Mizrahi-Man Orna
Department of Human Genetics, University of Chicago, Chicago, IL, 60637, USA.
F1000Res. 2015 May 19;4:121. doi: 10.12688/f1000research.6536.1. eCollection 2015.
Recently, the Mouse ENCODE Consortium reported that comparative gene expression data from human and mouse tend to cluster more by species than by tissue. This observation was surprising, as it contradicted much of the comparative gene regulatory data collected previously, as well as the common notion that major developmental pathways are highly conserved across a wide range of species, in particular across mammals. Here we show that the Mouse ENCODE gene expression data were collected using a flawed study design, which confounded sequencing batch (namely, the assignment of samples to sequencing flowcells and lanes) with species. When we account for the batch effect, the corrected comparative gene expression data from human and mouse tend to cluster by tissue, not by species.
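To make the notion of a batch confounded with species concrete, the following is a minimal sketch of how one might check a sample sheet for complete confounding. It is not the authors' analysis: the function names, the field names (`species`, `tissue`, `batch`), and both toy designs are hypothetical and are not taken from the Mouse ENCODE sample annotations.

```python
from collections import defaultdict


def species_per_batch(samples):
    """Map each sequencing batch to the set of species sequenced in it."""
    table = defaultdict(set)
    for sample in samples:
        table[sample["batch"]].add(sample["species"])
    return dict(table)


def is_confounded(samples):
    """Return True when every batch holds exactly one species.

    In that case a batch effect and a species effect cannot be told
    apart from the design alone.
    """
    return all(len(species) == 1 for species in species_per_batch(samples).values())


# Hypothetical toy designs (assumption: illustrative only, not real data).
confounded_design = [
    {"species": "human", "tissue": "heart", "batch": "flowcell_A"},
    {"species": "human", "tissue": "liver", "batch": "flowcell_A"},
    {"species": "mouse", "tissue": "heart", "batch": "flowcell_B"},
    {"species": "mouse", "tissue": "liver", "batch": "flowcell_B"},
]
balanced_design = [
    {"species": "human", "tissue": "heart", "batch": "flowcell_A"},
    {"species": "mouse", "tissue": "liver", "batch": "flowcell_A"},
    {"species": "human", "tissue": "liver", "batch": "flowcell_B"},
    {"species": "mouse", "tissue": "heart", "batch": "flowcell_B"},
]

if __name__ == "__main__":
    print("confounded design:", is_confounded(confounded_design))
    print("balanced design:", is_confounded(balanced_design))
```

In the first toy design each flowcell carries a single species, so any flowcell-level technical difference would look like a species difference. In the second, both species appear on every flowcell, which is what allows a batch term to be separated from the species term.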