Goode Matthew, Guindon Stéphane, Rodrigo Allen
The Bioinformatics Institute New Zealand, University of Auckland, Auckland, New Zealand.
Genome Inform. 2008;21:150-64.
Models of nucleotide or amino acid sequence evolution that implement homogeneous and stationary Markov processes of substitutions are mathematically convenient but are unlikely to represent the true complexity of evolution. With the large amounts of data that next-generation sequencing promises, appropriate models of evolution are important, particularly when data are collected from ancient and sub-fossil remains, where changes in evolutionary parameters are the norm and not the exception. In this paper, we describe a new codon-based model of evolution that applies to Measurably Evolving Populations (MEPs). A MEP is defined as a population from which it is possible to detect a statistically significant accumulation of substitutions when sequences are obtained at different times. The new model of codon evolution permits changes to the substitution process, including changes to the intensity of selection and the proportions of sites undergoing different selective pressures. In our serial model of codon evolution, changes in the selective regime occur simultaneously across all lineages. Different regions of the protein may also evolve under distinct selective patterns. We illustrate the application of the new model to a dataset of HIV-1 sequences obtained from an infected individual before and after the commencement of antiretroviral therapy.
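The idea that the selective regime changes at the same time on every lineage can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions that the abstract does not state: a simplified Goldman-Yang-style rate matrix with uniform codon frequencies, a single change time, and a single site class (changes in the proportions of sites under different selective pressures are left out). The names `rate_matrix`, `transition_probs` and `branch_probs`, and all parameter values, are hypothetical. Because the change time is shared by all lineages, a branch that spans it is split there, and its transition matrix is the product of the matrices for the two epochs.

```python
import itertools
import numpy as np

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(itertools.product(BASES, repeat=3), AMINO)}
CODONS = [c for c, aa in CODE.items() if aa != "*"]  # the 61 sense codons
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def rate_matrix(omega, kappa=2.0):
    """Unnormalised codon rate matrix (assumption: uniform codon frequencies)."""
    n = len(CODONS)
    q = np.zeros((n, n))
    for i, a in enumerate(CODONS):
        for j, b in enumerate(CODONS):
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) != 1:
                continue  # only single-nucleotide changes have non-zero rate
            rate = 1.0 / n
            if frozenset(diffs[0]) in TRANSITIONS:
                rate *= kappa  # transition/transversion rate ratio
            if CODE[a] != CODE[b]:
                rate *= omega  # nonsynonymous/synonymous rate ratio
            q[i, j] = rate
    q[np.diag_indices(n)] = -q.sum(axis=1)  # each row sums to zero
    return q


def transition_probs(q, t):
    """P(t) = exp(Qt). Q is symmetric here (uniform frequencies), so eigh applies."""
    vals, vecs = np.linalg.eigh(q)
    return (vecs * np.exp(vals * t)) @ vecs.T


def branch_probs(t_start, t_end, change_time, omega_before, omega_after, kappa=2.0):
    """Transition matrix for a branch running forwards in time from t_start to t_end.

    The selective regime switches from omega_before to omega_after at
    change_time, which is the same for every branch in the tree.
    """
    t_before = max(0.0, min(t_end, change_time) - t_start)
    t_after = max(0.0, t_end - max(t_start, change_time))
    p_before = transition_probs(rate_matrix(omega_before, kappa), t_before)
    p_after = transition_probs(rate_matrix(omega_after, kappa), t_after)
    return p_before @ p_after


# Hypothetical example: a branch spanning a regime change at time 0.4.
p = branch_probs(0.0, 1.0, change_time=0.4, omega_before=0.2, omega_after=1.5)
```

In a serially sampled dataset such as the HIV-1 example, the change time would naturally correspond to a known event such as the start of therapy; the values above are placeholders rather than estimates from the paper.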