Sampling phylogenetic tree space with the generalized Gibbs sampler.
Author information
Keith Jonathan M, Adams Peter, Ragan Mark A, Bryant Darryn
Affiliation
Department of Mathematics, University of Queensland, St. Lucia, Qld 4072, Australia.
Publication information
Mol Phylogenet Evol. 2005 Mar;34(3):459-68. doi: 10.1016/j.ympev.2004.11.016. Epub 2005 Jan 8.
The generalized Gibbs sampler (GGS) is a recently developed Markov chain Monte Carlo (MCMC) technique that enables Gibbs-like sampling of state spaces that lack a convenient representation in terms of a fixed coordinate system. This paper describes a new sampler, called the tree sampler, which uses the GGS to sample from a state space consisting of phylogenetic trees. The tree sampler is useful for a wide range of phylogenetic applications, including Bayesian, maximum likelihood, and maximum parsimony methods. A fast new algorithm to search for a maximum parsimony phylogeny is presented, using the tree sampler in the context of simulated annealing. The mathematics underlying the algorithm is explained and its time complexity is analyzed. The method is tested on two large data sets consisting of 123 sequences and 500 sequences, respectively. The new algorithm is shown to compare very favorably in terms of speed and accuracy to the program DNAPARS from the PHYLIP package.
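To make the idea of a simulated-annealing search for a maximum parsimony tree concrete, here is a minimal Python sketch. It is not the tree sampler described in the abstract: the proposal below is an ordinary Metropolis move (prune one leaf and regraft it at a random position), swapped in as a stand-in for the GGS-based update, and the parsimony score uses the standard Fitch algorithm. All function names, the nested-tuple tree representation, the cooling schedule, and the toy sequences are assumptions made for illustration only.

```python
import math
import random

def fitch(tree, seqs):
    """Fitch pass on a rooted binary tree given as nested 2-tuples of leaf names.
    Returns (per-site state sets at this node, number of changes below it)."""
    if isinstance(tree, str):
        return [{c} for c in seqs[tree]], 0
    left_sets, left_cost = fitch(tree[0], seqs)
    right_sets, right_cost = fitch(tree[1], seqs)
    sets, cost = [], left_cost + right_cost
    for a, b in zip(left_sets, right_sets):
        common = a & b
        if common:
            sets.append(common)
        else:
            sets.append(a | b)  # empty intersection: one change at this site
            cost += 1
    return sets, cost

def parsimony(tree, seqs):
    return fitch(tree, seqs)[1]

def prune(tree, leaf):
    """Remove `leaf`; its sibling takes the place of their parent."""
    if isinstance(tree, str):
        return tree
    left, right = tree
    if left == leaf:
        return right
    if right == leaf:
        return left
    return (prune(left, leaf), prune(right, leaf))

def count_nodes(tree):
    if isinstance(tree, str):
        return 1
    return 1 + count_nodes(tree[0]) + count_nodes(tree[1])

def graft(tree, leaf, k):
    """Attach `leaf` as the sibling of the k-th node in preorder (0 = root)."""
    if k == 0:
        return (tree, leaf)
    left, right = tree
    n_left = count_nodes(left)
    if k <= n_left:
        return (graft(left, leaf, k - 1), right)
    return (left, graft(right, leaf, k - 1 - n_left))

def anneal(tree, seqs, steps=2000, t0=2.0, cooling=0.995, seed=0):
    """Simulated annealing on parsimony score with a symmetric
    prune-and-regraft proposal (illustrative, not the GGS update)."""
    rng = random.Random(seed)
    names = sorted(seqs)
    current, cur_cost = tree, parsimony(tree, seqs)
    best, best_cost = current, cur_cost
    temp = t0
    for _ in range(steps):
        leaf = rng.choice(names)
        pruned = prune(current, leaf)
        cand = graft(pruned, leaf, rng.randrange(count_nodes(pruned)))
        cand_cost = parsimony(cand, seqs)
        delta = cand_cost - cur_cost
        # Always accept improvements; accept worse trees with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current, cur_cost
        temp *= cooling
    return best, best_cost

# Toy data (hypothetical): A and B share one pattern, C and D another.
toy_seqs = {"A": "AA", "B": "AA", "C": "TT", "D": "TT"}
start_tree = (("A", "C"), ("B", "D"))
best_tree, best_cost = anneal(start_tree, toy_seqs)
```

The sketch recomputes the full Fitch score for every candidate tree, which is adequate for a toy example but would be far too slow for data sets of the size mentioned in the abstract (123 and 500 sequences); it makes no attempt to reproduce the time complexity the paper analyzes.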