Delport Wayne, Scheffler Konrad, Seoighe Cathal
University of Cape Town, Observatory, 7925, Cape Town, South Africa.
Brief Bioinform. 2009 Jan;10(1):97-109. doi: 10.1093/bib/bbn049. Epub 2008 Oct 29.
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and, combined with statistical model comparison techniques, have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful because, in addition to providing a significant improvement in model realism for protein-coding sequences, they can also be designed to test hypotheses about the selective pressures that shape the evolution of the sequences. Such models typically assume a phylogeny and can be used to identify sites or lineages that have evolved adaptively. Recently, some of the key assumptions that underlie phylogenetic tests of selection have been questioned, such as the assumption that the rate of synonymous changes is constant across sites, or that a single phylogenetic tree can be assumed at all sites for recombining sequences. While some of these issues have been addressed through the development of novel methods, others remain as caveats that need to be considered on a case-by-case basis. Here, we outline the theory of codon models and their application to the detection of positive selection. We review some of the more recent developments that have improved their power and utility, laying a foundation for further advances in the modeling of coding sequence evolution.