Dipartimento di Biologia Evoluzionistica, Università degli Studi di Firenze, via Romana 19, 50125 Firenze, Italy.
Mol Phylogenet Evol. 2011 Aug;60(2):228-35. doi: 10.1016/j.ympev.2011.04.015. Epub 2011 May 7.
Nucleotide distribution in genomes is known not to be random: it shows specific motifs, long- and short-range correlations, periodicities, and other regularities. Motifs in particular are critical for recognition by specific proteins that affect chromosome organization, transcription, and DNA replication, but little is known about the possible functional effects of nucleotide distribution on the conformational landscape of DNA, effects that could lead to differential selective pressures throughout evolution. Promoter sequences play a fundamental role in regulating gene activity, and a vast literature suggests that their conformational landscapes may be a critical factor in gene expression dynamics. On these grounds, to investigate whether promoter base distributions follow phylogenetic patterns, we analyzed GC/AT ratios along the 1000 nucleotides upstream of the transcription start site (TSS) in large sets of promoters from organisms ranging from bacteria to multicellular eukaryotes. The data showed very clear phylogenetic trends in promoter base distribution. In all cases, monotone gradients were observed, either GC-rich or AT-rich: the former in eukaryotes, the latter, along with strand biases, in bacteria. Moreover, within eukaryotes, GC-rich gradients increased in length from unicellular organisms to plants to vertebrates and, within them, from ancestral to more recent species. Finally, the results are discussed with particular attention to the possible correlation between nucleotide distribution patterns, evolution, and the putative existence of differential selection pressures, deriving from structural and/or functional constraints, between and within prokaryotes and eukaryotes.
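As an illustration of the kind of measurement described above, the following minimal sketch computes a pooled GC/AT ratio in consecutive windows along a set of equal-length upstream sequences and checks whether the resulting profile is monotone. It is not the authors' pipeline: the function names `gc_at_ratio_profile` and `is_monotone`, the window and step sizes, and the pooling of counts across promoters are all assumptions made for this example, since the abstract does not specify them.

```python
from typing import List, Sequence


def gc_at_ratio_profile(
    promoters: Sequence[str], window: int = 50, step: int = 50
) -> List[float]:
    """Pooled GC/AT ratio per window along equal-length upstream sequences.

    Assumption: every sequence is written 5'->3' and ends at the base
    immediately upstream of the TSS, so the last window is the one
    closest to the TSS. Window and step sizes are hypothetical.
    """
    if not promoters:
        raise ValueError("at least one promoter sequence is required")
    length = len(promoters[0])
    if any(len(seq) != length for seq in promoters):
        raise ValueError("all promoter sequences must have the same length")

    profile = []
    for start in range(0, length - window + 1, step):
        gc = at = 0
        for seq in promoters:
            chunk = seq[start:start + window].upper()
            gc += chunk.count("G") + chunk.count("C")
            at += chunk.count("A") + chunk.count("T")
        # Ambiguous bases (e.g. N) are ignored; a window with no A/T
        # yields infinity rather than raising ZeroDivisionError.
        profile.append(gc / at if at else float("inf"))
    return profile


def is_monotone(profile: Sequence[float], increasing: bool = True) -> bool:
    """True if the profile never decreases (or never increases)."""
    pairs = list(zip(profile, profile[1:]))
    if increasing:
        return all(a <= b for a, b in pairs)
    return all(a >= b for a, b in pairs)


if __name__ == "__main__":
    # Toy sequences, not real promoters: AT-rich far from the TSS,
    # GC-rich next to it.
    toy = ["AATTGGCA", "ATATGCGT"]
    ratios = gc_at_ratio_profile(toy, window=4, step=4)
    print(ratios, is_monotone(ratios, increasing=True))
```

For real data one would pass the 1000-nucleotide upstream regions of many genes from one organism and compare the resulting profiles across species; a per-promoter average instead of pooled counts would be an equally defensible choice.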