Herbei Radu, Kubatko Laura
The Ohio State University – Statistics, Columbus, OH, USA.
Stat Appl Genet Mol Biol. 2013 Mar 26;12(1):39-48. doi: 10.1515/sagmb-2012-0023.
Markov chains are widely used for modeling in many areas of molecular biology and genetics. As the complexity of such models advances, it becomes increasingly important to assess the rate at which a Markov chain converges to its stationary distribution in order to carry out accurate inference. A common measure of convergence to the stationary distribution is the total variation distance, but this measure can be difficult to compute when the state space of the chain is large. We propose a Monte Carlo method to estimate the total variation distance that can be applied in this situation, and we demonstrate how the method can be efficiently implemented by taking advantage of GPU computing techniques. We apply the method to two Markov chains on the space of phylogenetic trees, and discuss the implications of our findings for the development of algorithms for phylogenetic inference.
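To make the idea of Monte Carlo estimation of total variation distance concrete, the following is a minimal sketch on a toy chain. It is not the estimator proposed by the authors, and it does not use GPU computing. It assumes a lazy random walk on a small cycle (a hypothetical stand-in for a chain on tree space) whose stationary distribution is known to be uniform. It runs many independent chains for t steps, forms the empirical distribution of their end states, and compares it with the stationary distribution. All function names and parameters here are illustrative assumptions. For a state space this small the exact distance can also be computed, which gives a check on the estimate.

```python
import random
from collections import Counter


def step(state, n_states, rng):
    """One move of a lazy random walk on a cycle (toy chain, an assumption):
    stay with probability 1/2, otherwise move one step left or right."""
    u = rng.random()
    if u < 0.5:
        return state
    if u < 0.75:
        return (state + 1) % n_states
    return (state - 1) % n_states


def estimate_tv(n_states, t, n_chains, start=0, seed=0):
    """Plug-in Monte Carlo estimate of the total variation distance between
    the t-step distribution started at `start` and the uniform stationary
    distribution, based on n_chains independent runs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_chains):
        x = start
        for _ in range(t):
            x = step(x, n_states, rng)
        counts[x] += 1
    pi = 1.0 / n_states  # uniform stationary probability of each state
    return 0.5 * sum(abs(counts[s] / n_chains - pi) for s in range(n_states))


def exact_tv(n_states, t, start=0):
    """Exact total variation distance for the same toy chain, obtained by
    propagating the probability vector t steps. Feasible only because the
    state space is tiny."""
    p = [0.0] * n_states
    p[start] = 1.0
    for _ in range(t):
        q = [0.0] * n_states
        for s, mass in enumerate(p):
            q[s] += 0.5 * mass
            q[(s + 1) % n_states] += 0.25 * mass
            q[(s - 1) % n_states] += 0.25 * mass
        p = q
    pi = 1.0 / n_states
    return 0.5 * sum(abs(v - pi) for v in p)


if __name__ == "__main__":
    for t in (0, 5, 20, 50):
        print(t, round(estimate_tv(8, t, 20000), 4), round(exact_tv(8, t), 4))
```

A plug-in estimate of this kind is biased upward when the true distance is close to zero, since sampling noise in each empirical frequency contributes a positive amount to the sum. It also requires visiting every state often enough to estimate its probability, which is the difficulty on large state spaces that the abstract refers to.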