Institute of Nano Science, State Key Laboratory of Mechanics and Control of Mechanical Structures, and College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China.
Gene. 2012 Nov 1;509(1):136-41. doi: 10.1016/j.gene.2012.07.075. Epub 2012 Aug 10.
Codon models are now widely used to draw evolutionary inferences from alignments of homologous sequences. By incorporating the physicochemical properties of amino acids, we present two novel codon substitution models, based on amino acid similarity scores, that describe the evolution of protein-coding DNA sequences. Substitutions between codons are described by a continuous-time Markov process. The models allow for transition/transversion rate bias and nonsynonymous codon usage bias. In our implementation, parameters are estimated by the maximum-likelihood (ML) method, as in previous studies. In addition, the second model considers instantaneous mutations involving more than one nucleotide position of a codon. The two models are applied to five real data sets. The results indicate that the new codon models, which account for the physicochemical properties of amino acids, fit the data better than existing codon models and thus produce more reliable estimates of certain biologically important measures.
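To make the continuous-time Markov description concrete, the sketch below builds a conventional single-nucleotide codon rate matrix in the style of Goldman and Yang (1994) and derives transition probabilities from it. It is not the similarity-score models of this paper: the abstract does not give their functional form, so no amino acid similarity term appears, and multi-nucleotide changes (as in the second model) are set to zero. The names `rate_matrix`, `transition_probs`, `kappa`, `omega`, and `pi` are assumptions introduced for illustration.

```python
import itertools
import numpy as np

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(itertools.product(BASES, repeat=3), AMINO)}
CODONS = [c for c in CODE if CODE[c] != "*"]  # the 61 sense codons
PURINES = {"A", "G"}


def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (x in PURINES) == (y in PURINES)


def rate_matrix(kappa, omega, pi):
    """GY94-style rate matrix (assumed form, not the paper's models).

    kappa: transition/transversion rate ratio
    omega: nonsynonymous/synonymous rate ratio
    pi:    equilibrium frequencies of the 61 sense codons
    """
    n = len(CODONS)
    Q = np.zeros((n, n))
    for i, ci in enumerate(CODONS):
        for j, cj in enumerate(CODONS):
            diffs = [(a, b) for a, b in zip(ci, cj) if a != b]
            if len(diffs) != 1:
                continue  # only single-nucleotide changes get a nonzero rate
            a, b = diffs[0]
            rate = pi[j]
            if is_transition(a, b):
                rate *= kappa
            if CODE[ci] != CODE[cj]:
                rate *= omega
            Q[i, j] = rate
    Q[np.diag_indices(n)] = -Q.sum(axis=1)  # each row sums to zero
    Q /= -(pi * np.diag(Q)).sum()           # one expected substitution per unit time
    return Q


def transition_probs(Q, pi, t):
    """P(t) = exp(Qt), computed via the symmetrized form of a reversible Q."""
    s = np.sqrt(pi)
    S = Q * s[:, None] / s[None, :]          # symmetric because pi_i q_ij = pi_j q_ji
    vals, U = np.linalg.eigh(S)
    expS = (U * np.exp(vals * t)) @ U.T
    return expS * s[None, :] / s[:, None]


pi = np.full(len(CODONS), 1.0 / len(CODONS))  # uniform frequencies, for illustration only
Q = rate_matrix(kappa=2.0, omega=0.5, pi=pi)
P = transition_probs(Q, pi, t=0.3)
```

In a maximum-likelihood fit, `kappa`, `omega`, and the branch lengths `t` would be the free parameters, with `P` entering the likelihood of the alignment along each branch of the tree.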