Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.
PLoS Comput Biol. 2011 Jun;7(6):e1002064. doi: 10.1371/journal.pcbi.1002064. Epub 2011 Jun 9.
Models of DNA evolution have made invaluable contributions to comparative genomics, although incorporating non-genomic features into these models has seemed formidable. To build an evolutionary model of transcription networks (TNs), we had to abandon the substitution model used in DNA evolution and start instead from modeling the evolution of regulatory relationships. We present a quantitative evolutionary model of TNs that places phylogenetic distance and the evolutionary changes in cis-regulatory sequence, gene expression and network structure within a single probabilistic framework. Using genome sequences and gene expression data from multiple species, this model can predict regulatory relationships between a transcription factor (TF) and its target genes in all species, and thus identify TN re-wiring events. Applying this model to the pre-implantation development of three mammalian species, we identified the conserved and re-wired components of the TNs downstream of a set of TFs including Oct4, Gata3/4/6, cMyc and nMyc. We also identified evolutionary events in the DNA sequence that led to turnover of TF binding sites, including the birth of an Oct4 binding site through a 2-nt deletion. In contrast to recent reports of large interspecies differences in TF binding sites and gene expression patterns, the interspecies differences in TF-target relationships are much smaller. The data showed increasing conservation levels from genomic sequences to TF-DNA interaction, gene expression, TN, and finally morphology, suggesting that evolutionary changes are larger at molecular levels and smaller at functional levels. The data also showed that evolutionarily older TFs are more likely to have conserved target genes, whereas younger TFs tend to have higher re-wiring rates.