Squartini Federico, Arndt Peter F
Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
Mol Biol Evol. 2008 Dec;25(12):2525-35. doi: 10.1093/molbev/msn169. Epub 2008 Aug 5.
Markov models of the nucleotide substitution process, widely used in phylogeny reconstruction, usually assume stationarity and time reversibility. Although these models give meaningful results when applied to biological data, it is not clear whether these 2 assumptions hold and, if not, how far sequence evolution processes deviate from them. To address this, we introduce 2 sets of indices that can be calculated from the nucleotide distribution and the substitution rates. The stationarity indices (STIs) can be used to test the validity of the equilibrium assumption. The irreversibility indices (IRIs) are derived from the Kolmogorov cycle conditions for time reversibility and quantify the degree to which a process departs from time reversibility. We have computed STIs and IRIs for the evolutionary process of 2 lineages, Drosophila simulans and Homo sapiens. In the latter case, we use a modified form of the indices that takes into account the CpG decay process. In both cases, we find statistically significant deviations from the ideal case of a process that has reached stationarity and is time reversible.
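The two properties discussed in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas for the STIs or IRIs, so the quantities below are generic stand-ins and not the authors' indices: a log-ratio of forward to reverse rate products around each three-state cycle (zero for every cycle when the Kolmogorov cycle conditions hold) and the stationary distribution implied by a rate matrix, which can be compared with an observed nucleotide composition. All function names, the state order A, C, G, T, and the example rates are assumptions made for illustration. When all substitution rates are positive, checking the three-state cycles is enough, because longer cycles can be decomposed into them.

```python
from math import log

NUCLEOTIDES = "ACGT"  # assumed state order, for illustration only


def gtr_rates(pi, s):
    """Build off-diagonal rates Q[i][j] = s[i][j] * pi[j] (time reversible if s is symmetric)."""
    n = len(pi)
    return [[s[i][j] * pi[j] if i != j else 0.0 for j in range(n)] for i in range(n)]


def cycle_log_ratios(Q):
    """For each 3-cycle i -> j -> k -> i, return log(forward product / reverse product).

    Under the Kolmogorov cycle conditions every value is zero.
    This is an illustrative quantity, not the IRI defined in the paper.
    """
    n = len(Q)
    ratios = {}
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(j + 1, n):
                forward = Q[i][j] * Q[j][k] * Q[k][i]
                reverse = Q[i][k] * Q[k][j] * Q[j][i]
                ratios[(i, j, k)] = log(forward / reverse)
    return ratios


def stationary_distribution(Q, tol=1e-13, max_iter=100000):
    """Approximate pi with pi Q = 0 by power iteration on a uniformized chain."""
    n = len(Q)
    exit_rates = [sum(Q[i][j] for j in range(n) if j != i) for i in range(n)]
    lam = 2.0 * max(exit_rates)  # keeps every self-transition probability positive
    P = [[Q[i][j] / lam if i != j else 1.0 - exit_rates[i] / lam for j in range(n)]
         for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


# Hypothetical example: a reversible model, then the same model with one rate doubled.
example_pi = [0.1, 0.2, 0.3, 0.4]
example_s = [[0.0, 1.0, 2.0, 0.5],
             [1.0, 0.0, 1.5, 0.8],
             [2.0, 1.5, 0.0, 1.2],
             [0.5, 0.8, 1.2, 0.0]]
Q_reversible = gtr_rates(example_pi, example_s)
Q_perturbed = [row[:] for row in Q_reversible]
Q_perturbed[0][1] *= 2.0  # double the A -> C rate, which breaks reversibility

for label, Q in (("reversible", Q_reversible), ("perturbed", Q_perturbed)):
    print(label)
    for (i, j, k), value in cycle_log_ratios(Q).items():
        print("  cycle", NUCLEOTIDES[i] + NUCLEOTIDES[j] + NUCLEOTIDES[k], round(value, 6))
    print("  stationary:", [round(p, 4) for p in stationary_distribution(Q)])
```

In this sketch, a gap between an observed nucleotide composition and the output of `stationary_distribution` would point to a process that has not reached equilibrium, and a nonzero cycle log-ratio would point to irreversibility. Judging whether such deviations are statistically significant, as the abstract reports, would require a proper test that is not shown here.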