MrBayes tgMC³: a tight GPU implementation of MrBayes.

Affiliation

Guangzhou Institute of Advanced Technology, Chinese Academy of Sciences, Guangzhou, China.

Publication information

PLoS One. 2013 Apr 9;8(4):e60667. doi: 10.1371/journal.pone.0060667. Print 2013.

Abstract

MrBayes is a model-based phylogenetic inference tool that uses Bayesian statistics. However, model-based assessment of phylogenetic trees adds to the computational burden of tree searching and so poses significant computational challenges. Graphics Processing Units (GPUs) have been proposed as high-performance, low-cost acceleration platforms, and several parallelized versions of the Metropolis Coupled Markov Chain Monte Carlo (MC³) algorithm in MrBayes have been presented that can run on GPUs. However, some bottlenecks decrease the efficiency of these implementations. To address these bottlenecks, we propose a tight GPU MC³ (tgMC³) algorithm. tgMC³ implements a different architecture from the one-to-one acceleration architecture employed in previously proposed methods. It merges multiple discrete GPU kernels according to their data dependencies and hence decreases both the number of kernels launched and the complexity of data transfer. We implemented tgMC³ and compared its performance with an earlier proposed algorithm, nMC³, and also with MrBayes MC³ run as a serial process and as multiple concurrent CPU processes. All of the methods were benchmarked on the same computing node from DEGIMA. Experiments indicate that tgMC³ outperforms nMC³ (v1.0) with speedup factors from 2.1 to 2.7×. In addition, tgMC³ outperforms the serial MrBayes MC³ by a factor of 6 to 30× when using a single GTX 480 card, whereas a speedup factor of around 51× can be achieved by using two GTX 480 cards on relatively long sequences. Moreover, tgMC³ was compared with MrBayes accelerated by BEAGLE, and achieved speedup factors from 3.7 to 5.7×. The reported performance improvement of tgMC³ is significant and appears to scale well with increasing dataset sizes. In addition, the strategy proposed in tgMC³ could benefit the acceleration of other Bayesian phylogenetic analysis methods using GPUs.
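For readers unfamiliar with the Metropolis-coupled MCMC scheme that the abstract refers to, the following is a minimal CPU-only sketch of the general idea, not the authors' GPU implementation. It assumes a toy one-dimensional bimodal target in place of a phylogenetic posterior, an incremental heating scheme of the form β_i = 1/(1 + λi), and one swap attempt per step between a randomly chosen pair of chains. The function names, the `heat` value of 0.2, and the step size are all illustrative choices.

```python
import math
import random


def log_target(x):
    # Toy bimodal density: equal mixture of unit-variance Gaussians at -3 and +3.
    # ASSUMPTION: this stands in for the phylogenetic posterior, which is not modeled here.
    a = -0.5 * (x - 3.0) ** 2
    b = -0.5 * (x + 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(beta_i, beta_j, logp_i, logp_j):
    # Log acceptance ratio for exchanging the states of chains i and j:
    # (p_j / p_i)^beta_i * (p_i / p_j)^beta_j
    return (beta_i - beta_j) * (logp_j - logp_i)


def accept(rng, log_ratio):
    # Metropolis rule: accept with probability min(1, exp(log_ratio)).
    return rng.random() < math.exp(min(0.0, log_ratio))


def mc3(n_chains=4, n_steps=20000, heat=0.2, step=1.0, seed=0):
    rng = random.Random(seed)
    # Chain 0 is the cold chain (beta = 1); higher indices are increasingly heated.
    betas = [1.0 / (1.0 + heat * i) for i in range(n_chains)]
    states = [0.0] * n_chains
    logps = [log_target(x) for x in states]
    cold_samples = []
    for _ in range(n_steps):
        # Within-chain Metropolis update; chain i targets the density raised to beta_i.
        for i in range(n_chains):
            proposal = states[i] + rng.gauss(0.0, step)
            lp = log_target(proposal)
            if accept(rng, betas[i] * (lp - logps[i])):
                states[i], logps[i] = proposal, lp
        # Between-chain step: propose swapping the states of two random chains.
        i, j = rng.sample(range(n_chains), 2)
        if accept(rng, swap_log_ratio(betas[i], betas[j], logps[i], logps[j])):
            states[i], states[j] = states[j], states[i]
            logps[i], logps[j] = logps[j], logps[i]
        # Only the cold chain contributes samples.
        cold_samples.append(states[0])
    return betas, cold_samples


if __name__ == "__main__":
    betas, samples = mc3()
    print("betas:", betas)
    print("fraction of cold samples in the right-hand mode:",
          sum(s > 0 for s in samples) / len(samples))
```

In this sketch the within-chain updates are independent across chains, which is the part that lends itself to parallel execution; the swap step is the only point where chains interact.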

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a04e/3621901/7a2ad33d738f/pone.0060667.g001.jpg
