Shah Kushal, Krishnamachari Annangarachari
School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India.
Biosystems. 2012 Mar;107(3):142-4. doi: 10.1016/j.biosystems.2011.11.006. Epub 2011 Nov 12.
Genomes of almost all organisms have been found to exhibit several periodicities, the most prominent being the three-base periodicity. It is more pronounced in the gene-coding regions and has been exploited to identify the segments of a genome that code for a protein. The three-base periodicity in gene-coding regions has been attributed to inhomogeneous nucleotide compositions at the three codon positions. However, this cannot explain the three-base periodicity present at the level of the whole genome, where the codon concept is not applicable. Even though the distribution of each nucleotide is uniform across positions 0 (mod 3), 1 (mod 3) and 2 (mod 3) when whole-genome data are considered, our analysis reveals that the three-base periodicity arises from higher correlations among nucleotides separated by three bases.
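The distinction drawn above, between composition at the three positions modulo 3 and correlation between nucleotides three bases apart, can be illustrated with a minimal Python sketch. This is not the authors' analysis: the function names `phase_composition` and `match_fraction`, the simple "fraction of identical bases" correlation measure, and the synthetic sequence `toy` are all assumptions made for illustration.

```python
from collections import Counter


def phase_composition(seq):
    """Return, for each position class 0, 1, 2 (mod 3), the fraction of A, C, G, T."""
    counts = [Counter(), Counter(), Counter()]
    for i, base in enumerate(seq):
        counts[i % 3][base] += 1
    result = []
    for c in counts:
        total = sum(c.values())
        result.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return result


def match_fraction(seq, lag):
    """Fraction of positions i for which seq[i] == seq[i + lag].

    A crude stand-in for a lag-dependent nucleotide correlation (assumption).
    """
    n = len(seq) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and smaller than the sequence length")
    return sum(seq[i] == seq[i + lag] for i in range(n)) / n


# Hypothetical toy sequence: three blocks with the same period-3 repeat in
# different frames, so every position class (mod 3) has the same composition.
toy = "ACG" * 4 + "CGA" * 4 + "GAC" * 4

composition = phase_composition(toy)
for lag in (1, 2, 3):
    print(lag, round(match_fraction(toy, lag), 3))
```

In this toy sequence the composition is identical at all three position classes, yet bases three apart match far more often than bases one or two apart. Real genomic data would need a proper statistical treatment, which this sketch does not attempt.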