PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment.

Affiliation

Centre Robert Cedergren pour la Bioinformatique, Département de Biochimie, Université de Montréal, C.P. 6128, Succursale Centre-ville, Montréal, Québec H3C 3J7, Canada.

Publication information

Syst Biol. 2013 Jul;62(4):611-5. doi: 10.1093/sysbio/syt022. Epub 2013 Apr 5.

Abstract

Modeling across-site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures are more expressive. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, which are more realistic and fit empirical data better, have been practically out of reach. We introduce a Message Passing Interface (MPI) version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient by combining two algorithmic strategies: a partial Gibbs sampling update of the tree topology and a truncated stick-breaking representation of the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org.
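
The truncated stick-breaking representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the PhyloBayes MPI code: the function name `truncated_stick_breaking`, the concentration parameter `alpha`, and the truncation level `k_max` are hypothetical names chosen for illustration. The sketch assumes the standard construction, in which each stick fraction is drawn from Beta(1, alpha) and the last component absorbs whatever mass remains, so that a fixed, finite number of mixture weights sums to one.

```python
import random


def truncated_stick_breaking(alpha, k_max, rng):
    """Draw k_max mixture weights from a truncated stick-breaking prior.

    alpha  -- concentration parameter of the Dirichlet process (assumed > 0)
    k_max  -- truncation level, i.e. the fixed number of components (assumed >= 1)
    rng    -- a random.Random instance, passed in for reproducibility
    """
    if alpha <= 0 or k_max < 1:
        raise ValueError("alpha must be positive and k_max at least 1")
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(k_max):
        if k == k_max - 1:
            fraction = 1.0  # truncation: the last component takes the rest
        else:
            fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights


rng = random.Random(0)
weights = truncated_stick_breaking(alpha=1.0, k_max=20, rng=rng)
print(len(weights), sum(weights))
```

With a fixed number of components and explicit weights, the allocation of each site to a component can in principle be updated independently of the other sites, which is the kind of property that makes a sampler easier to distribute across processes. How PhyloBayes MPI exploits this in practice is described in the paper itself, not in this sketch.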

