Zhang Shang-Hong, Wang Lei
Key Laboratory of Gene Engineering of Ministry of Education, and Biotechnology Research Center, Sun Yat-sen University, Guangzhou, 510275, China.
BMC Res Notes. 2012 Nov 17;5:639. doi: 10.1186/1756-0500-5-639.
A majority profile for trinucleotide frequencies has been reported among genomes. Further study revealed that two common profiles, rather than one majority profile, exist for genomic trinucleotide frequencies. However, the origins of the common/majority profile remain elusive. Moreover, it is unclear whether the features of the common profiles extend to oligonucleotides other than trinucleotides.
We analyzed 571 prokaryotic genomes (chromosomes), selected eukaryotic nuclear genomes, and other genetic systems to study their compositional features. We found that genomic oligonucleotide frequencies also show two common profiles: one from low-GC content genomes and the other from high-GC content genomes. Furthermore, each common profile is highly correlated with the average profile of random sequences that have the corresponding GC content and are generated according to first-order symmetry.
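The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not give the procedure, so every function name, the sequence length, the replicate count, and the use of Pearson correlation are assumptions made for illustration. "First-order symmetry" is taken here to mean equal single-strand frequencies of complementary bases (A = T and G = C), which is also an assumption.

```python
import itertools
import random
from collections import Counter

BASES = "ACGT"


def kmer_profile(seq, k):
    """Relative frequencies of all 4**k k-mers, in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in itertools.product(BASES, repeat=k)]
    total = sum(counts[m] for m in kmers)  # ignores k-mers with non-ACGT symbols
    return [counts[m] / total for m in kmers]


def random_symmetric_sequence(length, gc, rng):
    """Random sequence with the given GC content and A = T, G = C on average
    (assumed reading of 'first-order symmetry')."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(rng.choices(BASES, weights=weights, k=length))


def average_random_profile(gc, k, length=20000, replicates=10, seed=0):
    """Average k-mer profile over several random symmetric sequences.
    length, replicates and seed are illustrative defaults, not from the paper."""
    rng = random.Random(seed)
    profiles = [
        kmer_profile(random_symmetric_sequence(length, gc, rng), k)
        for _ in range(replicates)
    ]
    return [sum(col) / replicates for col in zip(*profiles)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```

To use it on real data, one would compute `kmer_profile(genome_sequence, k)` for a genome and correlate it, via `pearson`, with `average_random_profile(gc, k)` evaluated at that genome's GC content. Under these assumptions, profiles generated at low and high GC content differ sharply from each other, which is consistent with two distinct common profiles rather than one.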
The existence of two common profiles appears to result mainly from GC content variation and strand symmetry of genomic sequences. Therefore, both GC content and strand symmetry are likely to play important roles in genome evolution.