Adaptive MCMC in Bayesian phylogenetics: an application to analyzing partitioned data in BEAST.

Authors

Baele Guy, Lemey Philippe, Rambaut Andrew, Suchard Marc A

Affiliations

Department of Microbiology and Immunology, Rega Institute, KU Leuven, Leuven, Belgium.

Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK.

Publication information

Bioinformatics. 2017 Jun 15;33(12):1798-1805. doi: 10.1093/bioinformatics/btx088.

Abstract

MOTIVATION

Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and for different positions within those genes. This requires estimating a large number of evolutionary parameters, which increases the computational burden of such analyses. The past two decades have also seen the rise of multi-core processors in both the central processing unit (CPU) and graphics processing unit (GPU) markets, enabling massively parallel computations that are not yet fully exploited by many software packages for multipartite analyses.

RESULTS

We propose a Markov chain Monte Carlo (MCMC) approach that uses an adaptive multivariate transition kernel to estimate in parallel a large number of parameters, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach estimates these multipartite parameters more efficiently than standard approaches, which typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves more than 14-fold.
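To make the idea of an adaptive multivariate transition kernel concrete, the following is a minimal sketch of a generic adaptive Metropolis sampler. It is not the BEAST implementation: the function name `adaptive_metropolis`, its parameters, the fixed warm-up proposal, and the toy correlated Gaussian target are all assumptions chosen for illustration, and the parallel evaluation of partition likelihoods described above is omitted. The sketch only shows how a multivariate normal proposal can learn its covariance from the chain history, so that correlated parameters are updated jointly rather than one at a time.

```python
import numpy as np

# Toy target (an assumption for illustration): a strongly correlated 2-D Gaussian.
PRECISION = np.linalg.inv(np.array([[1.0, 0.9], [0.9, 1.0]]))

def log_target(x):
    """Unnormalised log density of the toy target."""
    return -0.5 * x @ PRECISION @ x

def adaptive_metropolis(log_density, x0, n_iter=5000, adapt_start=500,
                        eps=1e-6, seed=0):
    """Random-walk Metropolis whose proposal covariance adapts to the chain."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    d = x.size
    scale = 2.38 ** 2 / d        # common scaling for Gaussian random-walk proposals
    mean = x.copy()              # running mean of the visited states
    cov = np.eye(d)              # running covariance estimate (starts at identity)
    logp = log_density(x)
    samples = np.empty((n_iter, d))
    accepted = 0
    for t in range(1, n_iter + 1):
        if t > adapt_start:
            # Adapted proposal: scaled empirical covariance plus a small jitter.
            prop_cov = scale * (cov + eps * np.eye(d))
        else:
            # Fixed proposal while the history is still too short to trust.
            prop_cov = 0.1 * np.eye(d)
        proposal = rng.multivariate_normal(x, prop_cov)
        logp_new = log_density(proposal)
        # Symmetric proposal, so the acceptance ratio is the density ratio.
        if np.log(rng.uniform()) < logp_new - logp:
            x, logp = proposal, logp_new
            accepted += 1
        samples[t - 1] = x
        # Recursive update of the running mean and covariance; the weight
        # 1 / (t + 1) shrinks over time, so adaptation diminishes.
        delta = x - mean
        mean = mean + delta / (t + 1)
        cov = cov + (np.outer(delta, x - mean) - cov) / (t + 1)
    return samples, accepted / n_iter
```

Calling `adaptive_metropolis(log_target, np.zeros(2))` returns the sampled states and the acceptance rate. Because the adaptation weight decays as the chain grows, the proposal settles down over time, which is the usual condition under which adaptive samplers of this kind still target the intended distribution.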

AVAILABILITY AND IMPLEMENTATION

Our implementation is part of the BEAST code base, a widely used open-source software package for Bayesian phylogenetic inference.

CONTACT

guy.baele@kuleuven.be.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
