Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany.
Structures and Computational Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany.
Mol Syst Biol. 2020 Aug;16(8):e9539. doi: 10.15252/msb.20209539.
For most biological processes, organisms must respond to extrinsic cues, while maintaining essential gene expression programmes. Although expression variation has been studied extensively in single cells, it is still unclear how it is controlled in multicellular organisms. Here, we used a machine-learning approach to identify genomic features that are predictive of genes with high versus low variation in their expression across individuals, using bulk data to remove stochastic cell-to-cell variation. Using embryonic gene expression across 75 Drosophila isogenic lines, we identify features predictive of expression variation (controlling for expression level), many of which are promoter-related. Genes with low variation fall into two classes reflecting different mechanisms to maintain robust expression, while genes with high variation seem to lack both types of stabilizing mechanisms. Applying this framework to humans revealed similar predictive features, indicating that promoter architecture is an ancient mechanism to control expression variation. Remarkably, expression variation features could also partially predict differential expression after diverse perturbations in both Drosophila and humans. Differential gene expression signatures may therefore be partially explained by genetically encoded gene-specific features, unrelated to the studied treatment.
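As an illustration of what "expression variation (controlling for expression level)" could mean in practice, the sketch below computes a per-gene variation score across individuals and removes its dependence on mean expression. It is a minimal sketch under stated assumptions, not the procedure used in the study: the linear fit on log scale, the quartile cut-off, the function names `residual_variation` and `label_high_low`, and the simulated data are all hypothetical choices made for this example. The feature-based classifier itself is not shown.

```python
import numpy as np

def residual_variation(expr):
    """Per-gene variation across individuals, corrected for expression level.

    expr: genes x individuals matrix of strictly positive expression values.
    Assumption: the mean-variation trend is removed with a straight-line fit
    of log(sd) on log(mean); a smoother such as LOESS may suit real data better.
    """
    expr = np.asarray(expr, dtype=float)
    log_mean = np.log(expr.mean(axis=1))
    log_sd = np.log(expr.std(axis=1, ddof=1))
    slope, intercept = np.polyfit(log_mean, log_sd, deg=1)
    # Residuals: variation not explained by expression level.
    return log_sd - (slope * log_mean + intercept)

def label_high_low(resid, quantile=0.25):
    """Label genes 1 (high variation), 0 (low variation) or -1 (unlabelled).

    Assumption: the top and bottom quartiles define the two classes.
    """
    low_cut, high_cut = np.quantile(resid, [quantile, 1 - quantile])
    labels = np.full(resid.shape, -1)
    labels[resid <= low_cut] = 0
    labels[resid >= high_cut] = 1
    return labels

# Simulated bulk expression: 200 genes measured in 75 lines (hypothetical data).
rng = np.random.default_rng(0)
gene_level = rng.normal(5.0, 1.0, size=(200, 1))      # per-gene log expression level
gene_spread = rng.uniform(0.05, 0.5, size=(200, 1))   # per-gene log-scale spread
expr = np.exp(rng.normal(gene_level, gene_spread, size=(200, 75)))

resid = residual_variation(expr)
labels = label_high_low(resid)
```

The resulting `labels` could then serve as the target of a classifier trained on genomic features such as promoter properties, with the unlabelled middle of the distribution left out.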