School of Mathematics and Physics, University of Tasmania, Tasmania, Australia.
Bull Math Biol. 2012 Apr;74(4):858-80. doi: 10.1007/s11538-011-9691-z.
It is known that the Kimura 3ST model of sequence evolution on phylogenetic trees can be extended quite naturally to arbitrary split systems. However, this extension relies heavily on mathematical peculiarities of the associated Hadamard transformation, and an analogous extension of the general Markov model has thus far proved elusive. In this paper, we rectify this shortcoming by showing how to extend the general Markov model on trees to include incompatible edges, and further still to more general network models. This is achieved by exploring the algebra of the generators of the continuous-time Markov chain together with the “splitting” operator that generates the branching process on phylogenetic trees. For simplicity, we first discuss the two-state case and then show that our results extend to more states with little complication. Intriguingly, when the two-state general Markov model is restricted to the parameter space of the binary symmetric model, our extension is indistinguishable from the Hadamard approach only on trees; as soon as any incompatible splits are introduced, the two approaches give rise to differing probability distributions with disparate structure. Through exploration of a simple example, we argue that our extension to more general networks has desirable properties that the previous approaches do not share. In particular, our construction allows for convergent evolution of previously divergent lineages, a property that is of significant interest for biological applications.
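As a rough illustration of the two-state setting mentioned above, the following sketch computes the transition matrix of a two-state continuous-time Markov chain from its generator and then restricts it to the binary symmetric case. It is not the paper's construction: the row-stochastic convention, the rate names `alpha` and `beta`, and the function names are all assumptions made for this example, and it covers only a single edge, not the splitting operator or any network structure.

```python
import math


def two_state_transition(alpha, beta, t):
    """Return exp(Q t) for the assumed generator Q = [[-alpha, alpha], [beta, -beta]].

    Rows index the starting state (an assumed convention). For two states the
    matrix exponential has the closed form I + (1 - e^{-(alpha+beta) t}) / (alpha+beta) * Q.
    """
    total = alpha + beta
    if total == 0.0:
        # No substitutions can occur, so the chain never leaves its start state.
        return [[1.0, 0.0], [0.0, 1.0]]
    decay = math.exp(-total * t)
    pi0 = beta / total   # stationary probability of state 0
    pi1 = alpha / total  # stationary probability of state 1
    return [
        [pi0 + pi1 * decay, pi1 * (1.0 - decay)],
        [pi0 * (1.0 - decay), pi1 + pi0 * decay],
    ]


def binary_symmetric_transition(rate, t):
    """Binary symmetric restriction: both substitution rates are equal."""
    return two_state_transition(rate, rate, t)


def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]
```

Under these assumptions, `matmul2(two_state_transition(a, b, s), two_state_transition(a, b, t))` agrees with `two_state_transition(a, b, s + t)` up to rounding, which is the semigroup property expected of a continuous-time Markov chain along a single edge.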