Programming and Runtime Support to FPGA Accelerator Deployment at Datacenter Scale.

Author information

Huang Muhuan, Wu Di, Yu Cody Hao, Fang Zhenman, Interlandi Matteo, Condie Tyson, Cong Jason

Affiliations

University of California Los Angeles; Falcon Computing Solutions, Inc.

University of California Los Angeles.

Publication information

Proc ACM Symp Cloud Comput. 2016 Oct;2016:456-469. doi: 10.1145/2987550.2987569.

Abstract

With the end of CPU core scaling due to dark silicon limitations, customized accelerators on FPGAs have gained increased attention in modern datacenters due to their low power consumption, high performance, and energy efficiency. As evidenced by Microsoft's FPGA deployment in its Bing search engine and Intel's $16.7 billion acquisition of Altera, integrating FPGAs into datacenters is considered one of the most promising approaches to sustain future datacenter growth. However, it is quite challenging for existing big data computing systems, such as Apache Spark and Hadoop, to access the performance and energy benefits of FPGA accelerators. In this paper we design and implement Blaze to provide programming and runtime support for enabling easy and efficient deployments of FPGA accelerators in datacenters. In particular, Blaze abstracts FPGA accelerators as a service (FaaS) and provides a set of clean programming APIs for big data processing applications to easily utilize those accelerators. Our Blaze runtime implements an FaaS framework to efficiently share FPGA accelerators among multiple heterogeneous threads on a single node, and extends Hadoop YARN with accelerator-centric scheduling to efficiently share them among multiple computing tasks in the cluster. Experimental results using four representative big data applications demonstrate that Blaze greatly reduces the programming effort needed to access FPGA accelerators in systems like Apache Spark and YARN, and improves system throughput by 1.7× to 3× (and energy efficiency by 1.5× to 2.7×) compared to a conventional CPU-only cluster.
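The accelerator-as-a-service idea in the abstract can be illustrated with a small sketch. This is not Blaze's API: the class `AcceleratorService`, its methods `register` and `submit`, and the `"vector_scale"` accelerator id are all hypothetical names chosen for illustration, and the "accelerator" is an ordinary Python function standing in for an FPGA kernel. The sketch only shows the general pattern of several client threads on one node sharing a single device through a request queue, with a CPU fallback when no matching accelerator is registered.

```python
import queue
import threading
from concurrent.futures import Future


class AcceleratorService:
    """Hypothetical node-level service that serializes access to one device."""

    def __init__(self):
        # Accelerator id -> callable standing in for an FPGA kernel (assumption).
        self._kernels = {}
        self._requests = queue.Queue()
        # A single worker thread owns the device, so requests run one at a time.
        self._worker = threading.Thread(target=self._serve, daemon=True)
        self._worker.start()

    def register(self, acc_id, kernel):
        self._kernels[acc_id] = kernel

    def submit(self, acc_id, data, cpu_fallback):
        """Queue a request and return a Future; run on the CPU if acc_id is unknown."""
        fut = Future()
        if acc_id not in self._kernels:
            fut.set_result(cpu_fallback(data))
            return fut
        self._requests.put((acc_id, data, fut))
        return fut

    def _serve(self):
        while True:
            acc_id, data, fut = self._requests.get()
            try:
                fut.set_result(self._kernels[acc_id](data))
            except Exception as exc:  # report kernel failures to the caller
                fut.set_exception(exc)


def demo():
    service = AcceleratorService()
    service.register("vector_scale", lambda xs: [2 * x for x in xs])

    def cpu_scale(xs):
        return [2 * x for x in xs]

    results = {}

    def client(i):
        fut = service.submit("vector_scale", [i, i + 1], cpu_scale)
        results[i] = fut.result(timeout=5)

    # Four client threads share the single simulated accelerator.
    threads = [threading.Thread(target=client, args=(i,)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # No accelerator is registered under this id, so the CPU function runs instead.
    fallback = service.submit("unknown", [1, 2], lambda xs: [x + 1 for x in xs])
    return results, fallback.result(timeout=5)
```

Calling `demo()` returns the per-thread results and the fallback result. Routing every request through one queue is the simplest way to share a device that can run only one task at a time; cluster-level, accelerator-centric scheduling such as the YARN extension the abstract mentions is outside the scope of this sketch.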
