Institute for Information Transmission Problems, RAS, Bolshoi Karetny per. 19, Moscow 127994, Russia.
BMC Evol Biol. 2012 Oct 6;12:200. doi: 10.1186/1471-2148-12-200.
The exponential growth of the number of fully sequenced genomes at varying taxonomic closeness allows one to characterize transcriptional regulation using comparative-genomics analysis instead of time-consuming experimental methods. A transcriptional regulatory unit consists of a transcription factor, its binding site and a regulated gene. These units constitute a graph which contains so-called "network motifs", subgraphs of a given structure. Here we consider genomes of closely related Enterobacteriales and estimate the fraction of conserved network motifs and sites as well as positions under selection in various types of non-coding regions.
Using a newly developed technique, we found that, among spacer types, the highest fraction of positions under selection, approximately 50%, was observed in synvergon spacers (between consecutive genes on the same strand), followed by ~45% in divergon spacers (common 5'-regions) and ~10% in convergon spacers (common 3'-regions). The fraction of selected positions in functional regions was higher: 60% in transcription factor-binding sites and ~45% in terminators and promoters. Small but significant differences were observed between Escherichia coli and Salmonella enterica. The overall fraction of positions under selection is similar to that observed in eukaryotes. The conservation of binding sites demonstrated some differences between types of regulatory units. In E. coli strains, interactions of the type "local transcriptional factor → gene" turned out to be more conserved within feed-forward loops (FFLs) than outside network motifs. Coherent FFLs tend to be less conserved than incoherent FFLs. A natural explanation is that the former imply functional redundancy.
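The three spacer classes named above follow directly from the strand orientation of the two flanking genes. The sketch below shows one way to assign them from a gene annotation. It is an illustration under simplifying assumptions (a linear coordinate system, non-overlapping genes, strands given as '+' or '-'), not the pipeline used in the study, and the `Gene` type, function names and gene coordinates are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # leftmost coordinate
    end: int     # rightmost coordinate
    strand: str  # '+' or '-'


def classify_spacer(left, right):
    """Classify the intergenic spacer between two adjacent genes.

    `left` lies upstream of `right` in genome coordinates.
    """
    if left.strand == right.strand:
        # Consecutive genes on the same strand.
        return "synvergon"
    if left.strand == "-" and right.strand == "+":
        # Genes point away from each other: the spacer is a common 5'-region.
        return "divergon"
    # left is '+', right is '-': genes point toward each other,
    # so the spacer is a common 3'-region.
    return "convergon"


def classify_spacers(genes):
    """Return (left_name, right_name, spacer_type) for each adjacent gene pair."""
    ordered = sorted(genes, key=lambda g: g.start)
    result = []
    for left, right in zip(ordered, ordered[1:]):
        if right.start > left.end:  # skip overlapping genes (no spacer)
            result.append((left.name, right.name, classify_spacer(left, right)))
    return result


# Hypothetical annotation, for illustration only.
toy_genes = [
    Gene("geneA", 100, 500, "+"),
    Gene("geneB", 600, 900, "+"),
    Gene("geneC", 1000, 1400, "-"),
    Gene("geneD", 1500, 1800, "+"),
]
toy_spacers = classify_spacers(toy_genes)
```

With this toy annotation, the geneA/geneB spacer is classified as synvergon, geneB/geneC as convergon, and geneC/geneD as divergon. A real bacterial chromosome is usually circular, so the pair formed by the last and first genes would need separate handling.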
The naïve hypothesis that FFLs would be highly conserved turned out to be not entirely true: the conservation of an FFL depends on its status in the transcriptional network and also on its usage. The fraction of positions under selection in intergenic regions of bacterial genomes is roughly similar to that of eukaryotes. Known regulatory sites explain 20±5% of selected positions.