Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD 20742, USA.
Syst Biol. 2012 Jan;61(1):170-3. doi: 10.1093/sysbio/syr100. Epub 2011 Oct 1.
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
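As a rough illustration of the kind of phylogenetic likelihood calculation that such a library is meant to accelerate, the following is a minimal sketch of Felsenstein's pruning recursion for a single nucleotide site under the Jukes-Cantor (JC69) model. It is not BEAGLE's API: the function names (`jc69_matrix`, `tip_partials`, `combine_partials`, `site_likelihood`), the choice of model, and the toy tree are all assumptions made for illustration. The inner loop in `combine_partials` is the per-node, per-site operation that is typically repeated across thousands of sites and nodes, which is why it tends to map well onto GPUs and SIMD units.

```python
import math

STATES = "ACGT"  # nucleotide state order assumed throughout this sketch


def jc69_matrix(t):
    """Return the 4x4 JC69 transition probability matrix for branch length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e   # probability of ending in the starting state
    diff = 0.25 - 0.25 * e   # probability of ending in one specific other state
    return [[same if i == j else diff for j in range(4)] for i in range(4)]


def tip_partials(base):
    """Partial likelihoods at a tip: 1 for the observed state, 0 otherwise."""
    return [1.0 if s == base else 0.0 for s in STATES]


def combine_partials(left, t_left, right, t_right):
    """Compute the parent's partial likelihoods from those of its two children
    (one step of the pruning recursion)."""
    p_left, p_right = jc69_matrix(t_left), jc69_matrix(t_right)
    parent = []
    for i in range(4):
        sum_left = sum(p_left[i][j] * left[j] for j in range(4))
        sum_right = sum(p_right[i][j] * right[j] for j in range(4))
        parent.append(sum_left * sum_right)
    return parent


def site_likelihood(root_partials):
    """Weight the root partials by the JC69 stationary frequencies (1/4 each)."""
    return sum(0.25 * p for p in root_partials)


# Toy tree ((A:0.1, A:0.1):0.05, C:0.2) with one observed site per tip.
inner = combine_partials(tip_partials("A"), 0.1, tip_partials("A"), 0.1)
root = combine_partials(inner, 0.05, tip_partials("C"), 0.2)
likelihood = site_likelihood(root)
```

In a full analysis the same update would be applied at every internal node and every alignment site, usually with rescaling to avoid numerical underflow, which this sketch omits.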