Extending the BEAGLE library to a multi-FPGA platform.

Affiliation

Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA.

Publication information

BMC Bioinformatics. 2013 Jan 19;14:25. doi: 10.1186/1471-2105-14-25.

DOI: 10.1186/1471-2105-14-25
PMID: 23331707
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3599256/

Abstract

BACKGROUND

Maximum Likelihood (ML)-based phylogenetic inference using Felsenstein's pruning algorithm is a standard method for estimating the evolutionary relationships amongst a set of species based on DNA sequence data, and is used in popular applications such as RAxML, PHYLIP, GARLI, BEAST, and MrBayes. The Phylogenetic Likelihood Function (PLF) and its associated scaling and normalization steps comprise the computational kernel for these tools. These computations are data intensive but contain fine-grained parallelism that can be exploited by coprocessor architectures such as FPGAs and GPUs. A general-purpose API called BEAGLE has recently been developed that includes optimized implementations of Felsenstein's pruning algorithm for various data-parallel architectures. In this paper, we extend the BEAGLE API to a multiple Field Programmable Gate Array (FPGA)-based platform called the Convey HC-1.
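To make the kernel described above concrete, the following is a minimal sketch of one PLF step and its scaling step for a single alignment site. It is an illustration only, not BEAGLE's API or the paper's hardware design: the Jukes-Cantor substitution model, the three-taxon tree, the branch lengths, and every function name (`jc69_matrix`, `tip_partials`, `plf`, `rescale`, `site_log_likelihood`) are assumptions chosen for brevity.

```python
import math

STATES = "ACGT"

def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for branch length t (assumed model)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def tip_partials(base):
    """Partial likelihoods at a leaf: 1 for the observed base, 0 otherwise."""
    return [1.0 if s == base else 0.0 for s in STATES]

def plf(left, p_left, right, p_right):
    """One pruning step: combine two child partial vectors into the parent's."""
    parent = []
    for s in range(4):
        from_left = sum(p_left[s][x] * left[x] for x in range(4))
        from_right = sum(p_right[s][x] * right[x] for x in range(4))
        parent.append(from_left * from_right)
    return parent

def rescale(partials):
    """Scaling step: divide by the largest entry and return the log of that factor."""
    m = max(partials)
    return [v / m for v in partials], math.log(m)

def site_log_likelihood(root, log_scale, freqs=(0.25,) * 4):
    """Tree likelihood for one site: weight root partials by base frequencies."""
    return math.log(sum(f * v for f, v in zip(freqs, root))) + log_scale

# Hypothetical tree ((A:0.1, C:0.1):0.05, G:0.2) for a single site.
inner, s1 = rescale(plf(tip_partials("A"), jc69_matrix(0.1),
                        tip_partials("C"), jc69_matrix(0.1)))
root, s2 = rescale(plf(inner, jc69_matrix(0.05),
                       tip_partials("G"), jc69_matrix(0.2)))
log_lik = site_log_likelihood(root, s1 + s2)
print(log_lik)
```

Each site is processed independently with the same small matrix-vector products, which is the fine-grained parallelism the abstract refers to. The scaling factors are accumulated in log space so that rescaling leaves the final log-likelihood unchanged.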

RESULTS

The core calculation of our implementation, which includes both the phylogenetic likelihood function (PLF) and the tree likelihood calculation, has an arithmetic intensity of 130 floating-point operations per 64 bytes of I/O, or 2.03 ops/byte. Its performance can thus be calculated as a function of the host platform's peak memory bandwidth and the implementation's memory efficiency, as 2.03 × peak bandwidth × memory efficiency. Our FPGA-based platform has a peak bandwidth of 76.8 GB/s and our implementation achieves a memory efficiency of approximately 50%, which gives an average throughput of 78 Gflops. For very large data sizes, this represents a ~40X speedup compared with BEAGLE's CPU implementation on a dual Xeon 5520 and a 3X speedup compared with BEAGLE's GPU implementation on a Tesla T10 GPU. The power consumption is 92 W, yielding a power efficiency of 1.7 Gflops per Watt.
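The bandwidth-bound estimate above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the abstract; the variable and function names are illustrative. Dividing the 78 Gflops average throughput by 92 W gives roughly 0.85 Gflops per Watt, whereas the quoted 1.7 Gflops per Watt matches the bandwidth ceiling (about 156 Gflops, before the memory-efficiency factor) divided by 92 W, so the sketch prints both ratios.

```python
# Figures quoted in the abstract.
FLOPS_PER_BLOCK = 130      # floating-point operations per block of I/O
BYTES_PER_BLOCK = 64       # bytes of I/O per block
PEAK_BW_GBS = 76.8         # peak memory bandwidth, GB/s
MEM_EFFICIENCY = 0.50      # approximate achieved fraction of peak bandwidth
POWER_W = 92.0             # power consumption, W

def throughput_gflops(intensity, peak_bw_gbs, efficiency):
    """Bandwidth-bound throughput: ops/byte x GB/s x efficiency."""
    return intensity * peak_bw_gbs * efficiency

intensity = FLOPS_PER_BLOCK / BYTES_PER_BLOCK                       # ops/byte
ceiling = throughput_gflops(intensity, PEAK_BW_GBS, 1.0)            # at 100% efficiency
achieved = throughput_gflops(intensity, PEAK_BW_GBS, MEM_EFFICIENCY)

print(f"arithmetic intensity: {intensity:.2f} ops/byte")
print(f"bandwidth ceiling:    {ceiling:.1f} Gflops")
print(f"at 50% efficiency:    {achieved:.1f} Gflops")
print(f"achieved / power:     {achieved / POWER_W:.2f} Gflops/W")
print(f"ceiling / power:      {ceiling / POWER_W:.2f} Gflops/W")
```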

CONCLUSIONS

The use of data-parallel architectures to achieve high performance for likelihood-based phylogenetic inference requires high memory bandwidth and a design methodology that emphasizes high memory efficiency. To achieve this objective, we integrated 32 pipelined processing elements (PEs) across four FPGAs. For the design of each PE, we developed a specialized synthesis tool to generate a floating-point pipeline with resource and throughput constraints to match the target platform. We have found that using low-latency floating-point operators can significantly reduce FPGA area and still meet the timing requirements of the target platform. We also found that this design methodology can achieve performance that exceeds that of a GPU-based coprocessor.
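The abstract does not say how work is divided among the 32 PEs. As a purely hypothetical illustration of how a data-parallel design can spread independent alignment sites over that many pipelines, the sketch below assumes an even split of 8 PEs per FPGA and contiguous, near-equal blocks of sites. The function name `partition_sites` and the returned dictionary layout are assumptions, not part of the paper or of BEAGLE.

```python
def partition_sites(n_sites, n_fpgas=4, pes_per_fpga=8):
    """Assign contiguous blocks of site indices to PEs (hypothetical scheme)."""
    n_pes = n_fpgas * pes_per_fpga
    base, extra = divmod(n_sites, n_pes)
    assignments = []
    start = 0
    for pe in range(n_pes):
        # The first `extra` PEs take one additional site so block sizes differ by at most 1.
        count = base + (1 if pe < extra else 0)
        assignments.append({
            "fpga": pe // pes_per_fpga,
            "pe": pe % pes_per_fpga,
            "sites": range(start, start + count),
        })
        start += count
    return assignments

# Example: 1,000 sites over 4 FPGAs x 8 PEs.
plan = partition_sites(1000)
for entry in plan[:3]:
    print(entry["fpga"], entry["pe"], len(entry["sites"]))
```

Because each site's likelihood is computed independently, a balanced split like this keeps every pipeline busy for about the same time. Whether the actual design uses contiguous blocks, interleaving, or another scheme is not stated in the abstract.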

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/0c3552588432/1471-2105-14-25-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/e3bec31d3cfd/1471-2105-14-25-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/dff8483337ea/1471-2105-14-25-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/cb7b9a689019/1471-2105-14-25-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/731a4acab43f/1471-2105-14-25-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/a845a5f630f9/1471-2105-14-25-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/a3bdd944267c/1471-2105-14-25-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d99f/3599256/93e905f17f00/1471-2105-14-25-8.jpg


