Goldman Nick, Whelan Simon
Department of Zoology, University of Cambridge, UK.
Mol Biol Evol. 2002 Nov;19(11):1821-31. doi: 10.1093/oxfordjournals.molbev.a004007.
Current mathematical models of amino acid sequence evolution are often applied in variants that match their expected amino acid frequencies to those observed in a data set under analysis. This has been achieved by setting the instantaneous rate of replacement of a residue i by another residue j proportional to the observed frequency of the resulting residue j. We describe a more general method that maintains the match between expected and observed frequencies but permits replacement rates to be proportional to the frequencies of both the replaced and resulting residues, raised to powers other than 1. Analysis of a database of amino acid alignments shows that the description of the evolutionary process in a majority (approximately 70% of 182 alignments) is significantly improved by use of the new method, and a variety of analyses indicate that parameter estimation with the new method is well-behaved. Improved evolutionary models increase our understanding of the process of molecular evolution and are often expected to lead to improved phylogenetic inferences, and so it seems justified to consider our new variants of existing standard models when performing evolutionary analyses of amino acid sequences. Similar methods can be used with nucleotide substitution models, but we have not found these to give corresponding significant improvements to our ability to describe the processes of nucleotide sequence evolution.
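The abstract does not state the rate formula, so the following is only a minimal sketch of one parameterization consistent with its description. It assumes a symmetric exchangeability matrix `s`, target frequencies `pi`, and off-diagonal rates q_ij = s_ij * pi_i^(-f) * pi_j^(1-f). The exponent name `f`, the function names, and the toy 4-state data are hypothetical and chosen for illustration. With f = 0 the rate is proportional to the frequency of the resulting residue only, as in the usual construction. For any f the product pi_i * q_ij is symmetric in i and j, so `pi` remains the equilibrium distribution.

```python
import random


def build_rate_matrix(s, pi, f=0.0):
    """Return Q with q_ij = s_ij * pi_i**(-f) * pi_j**(1 - f) for i != j.

    s  : symmetric exchangeability matrix (list of lists); diagonal ignored
    pi : target equilibrium frequencies (positive, summing to 1)
    f  : hypothetical exponent parameter (an assumption of this sketch);
         f = 0 makes each rate proportional to pi_j alone
    """
    n = len(pi)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                q[i][j] = s[i][j] * pi[i] ** (-f) * pi[j] ** (1.0 - f)
        # Diagonal is still 0.0 here, so this sets it to minus the
        # off-diagonal row sum and every row sums to zero.
        q[i][i] = -sum(q[i])
    return q


def stationarity_residual(q, pi):
    """Largest |sum_i pi_i * q_ij| over columns j; zero if pi is stationary."""
    n = len(pi)
    return max(abs(sum(pi[i] * q[i][j] for i in range(n))) for j in range(n))


def toy_inputs(n=4, seed=0):
    """Random symmetric exchangeabilities and normalized frequencies (toy data)."""
    rng = random.Random(seed)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s[i][j] = s[j][i] = rng.uniform(0.1, 2.0)
    raw = [rng.uniform(0.1, 1.0) for _ in range(n)]
    total = sum(raw)
    pi = [x / total for x in raw]
    return s, pi


if __name__ == "__main__":
    s, pi = toy_inputs()
    for f in (0.0, 0.3, 1.0):
        q = build_rate_matrix(s, pi, f)
        print(f, stationarity_residual(q, pi))
```

Running the script prints, for each exponent, a residual that should be zero up to floating-point rounding. This illustrates how the expected frequencies can stay matched to the observed ones while the dependence of the rates on the replaced and resulting residues changes.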