Bioinformatics Centre, Northwest A&F University, Yangling, Shaanxi, 712100, China.
BMC Genomics. 2010 Jul 5;11:416. doi: 10.1186/1471-2164-11-416.
Sequence context is an important aspect of base mutagenesis, and three-base periodicity is an intrinsic property of coding sequences. However, how three-base periodicity is affected in the vicinity of substitutions remains unclear. The effect of context on mutagenesis should be reflected in the usage of nucleotides that flank substitutions. Relative entropy (also known as Kullback-Leibler divergence) is useful for finding unusual patterns in biological sequences.
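As an illustration of the measure, relative entropy between an observed nucleotide distribution P and a background distribution Q is D(P||Q) = sum over x of P(x) log2(P(x)/Q(x)). The following is a minimal sketch, not the authors' implementation: the function names, the choice of log base 2, and the use of a user-supplied background distribution are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def base_frequencies(column):
    """Return the relative frequency of A, C, G and T in a string of bases."""
    counts = Counter(column)
    n = sum(counts[b] for b in BASES)
    if n == 0:
        raise ValueError("column contains no A/C/G/T characters")
    return {b: counts[b] / n for b in BASES}

def relative_entropy(observed, background):
    """Kullback-Leibler divergence D(observed || background) in bits.

    Terms with observed probability 0 contribute 0 by convention.
    Assumption: the background is non-zero wherever the observed
    distribution is non-zero.
    """
    total = 0.0
    for b in BASES:
        p = observed.get(b, 0.0)
        if p > 0:
            total += p * math.log2(p / background[b])
    return total

def positional_relative_entropy(flanks, background):
    """Relative entropy at each position of equal-length flanking sequences.

    `flanks` is a hypothetical list of aligned sequences surrounding
    substitution sites; position i of the result describes column i.
    """
    length = len(flanks[0])
    if any(len(s) != length for s in flanks):
        raise ValueError("all flanking sequences must have the same length")
    return [
        relative_entropy(base_frequencies("".join(s[i] for s in flanks)), background)
        for i in range(length)
    ]
```

Plotting the returned values against position relative to the substitution site gives the kind of profile in which a three-base periodic pattern could be inspected.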
Using relative entropy, we visualized the periodic patterns in the context of substitutions in human orthologous genes. Neighbouring patterns differed both among substitution categories and, within a category, among the three codon positions at which the substitution occurred. Transitions tended to occur in more periodic sequences than transversions. Periodic signals were stronger in the flanking sequences of substitutions at third-codon positions than in those of substitutions at first- or second-codon positions. To determine how the three-base periodicity was affected near the substitution sites, we fitted a sine model to the relative entropy values. A sine with a period of 3 is a good approximation of the three-base periodicity at sites that are not in the close vicinity of some substitutions. The periodicity was interrupted near the substitution site and reappeared farther away from it. A comparative analysis of the native and codon-shuffled datasets suggested that codon usage frequency was not the sole origin of the three-base periodicity, implying that the native order of codons also played an important role. Synonymous codon shuffling revealed that synonymous codon usage bias was one of the factors responsible for the observed three-base periodicity.
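The sine-fitting step can be sketched as follows. This is a simplified illustration rather than the fitting procedure used in the study: it assumes the period is fixed at 3, that positions are consecutive integers, and that the window length is a multiple of 3. Under those assumptions the least-squares fit of offset + a sin(2πx/3) + b cos(2πx/3) reduces to simple projections, because the sine, cosine and constant terms are orthogonal over whole periods. The function name and returned fields are hypothetical.

```python
import math

def fit_period3_sine(values, start=0):
    """Least-squares fit of offset + a*sin(2*pi*x/3) + b*cos(2*pi*x/3).

    `values` are relative entropy values at consecutive integer positions
    start, start+1, ...; the window length must be a multiple of 3 so that
    the closed-form projections below equal the least-squares solution.
    """
    n = len(values)
    if n == 0 or n % 3 != 0:
        raise ValueError("window length must be a positive multiple of 3")
    w = 2 * math.pi / 3
    xs = range(start, start + n)
    offset = sum(values) / n
    a = 2 / n * sum((y - offset) * math.sin(w * x) for x, y in zip(xs, values))
    b = 2 / n * sum((y - offset) * math.cos(w * x) for x, y in zip(xs, values))
    fitted = [offset + a * math.sin(w * x) + b * math.cos(w * x) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(values, fitted))
    ss_tot = sum((y - offset) ** 2 for y in values)
    # R^2 is reported as 0 for a flat signal, which has no variance to explain.
    r_squared = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return {"amplitude": math.hypot(a, b), "offset": offset, "r_squared": r_squared}
```

Applying such a fit to windows close to and far from the substitution site, and comparing the goodness of fit, is one way to quantify where the periodicity is interrupted and where it reappears.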
Our results offer an efficient way to illustrate unusual periodic patterns in the context of substitutions and provide further insight into the origin of three-base periodicity. This periodicity is a result of the native codon order in the reading frame. The period length of 3 is caused by the usage bias of nucleotides in synonymous codons. The periodic features of nucleotides surrounding substitutions aid in further understanding genetic variation and nucleotide mutagenesis.
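The two shuffling controls mentioned above can be illustrated with a short sketch. The abstract does not specify how the shuffled datasets were built, so this is one plausible reading and an assumption throughout: codon shuffling permutes whole codons within a coding sequence (keeping codon usage but destroying codon order), and synonymous codon shuffling permutes codons only among positions that encode the same amino acid (keeping the protein and the codon usage). The standard genetic code and uppercase DNA input are assumed, and all function names are hypothetical.

```python
import random
from collections import defaultdict

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

def split_codons(cds):
    """Split an in-frame coding sequence into a list of codons."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]

def translate(cds):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))

def shuffle_codons(cds, rng):
    """Permute all codons: codon usage is kept, native codon order is lost."""
    codons = split_codons(cds)
    rng.shuffle(codons)
    return "".join(codons)

def shuffle_synonymous(cds, rng):
    """Permute codons only among positions encoding the same amino acid."""
    codons = split_codons(cds)
    positions = defaultdict(list)
    for idx, codon in enumerate(codons):
        positions[CODON_TABLE[codon]].append(idx)
    shuffled = list(codons)
    for idxs in positions.values():
        pool = [codons[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            shuffled[i] = codon
    return "".join(shuffled)
```

For example, `shuffle_synonymous(cds, random.Random(0))` returns a sequence that translates to the same protein as `cds`; passing a seeded `random.Random` instance keeps the shuffled dataset reproducible.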