AutoBridge: Coupling Coarse-Grained Floorplanning and Pipelining for High-Frequency HLS Design on Multi-Die FPGAs.

Authors

Guo Licheng, Chi Yuze, Wang Jie, Lau Jason, Qiao Weikang, Ustun Ecenur, Zhang Zhiru, Cong Jason

Affiliations

University of California, Los Angeles.

Cornell University.

Publication

FPGA. 2021 Feb;2021:81-92. doi: 10.1145/3431920.3439289.

DOI: 10.1145/3431920.3439289
PMID: 33851145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8041363/

Abstract

Despite an increasing adoption of high-level synthesis (HLS) for its design productivity advantages, there remains a significant gap in the achievable frequency between an HLS design and a handcrafted RTL one. A key factor that limits the timing quality of the HLS outputs is the difficulty in accurately estimating the interconnect delay at the HLS level. This problem becomes even worse when large HLS designs are implemented on the latest multi-die FPGAs. To tackle this challenge, we propose AutoBridge, an automated framework that couples a coarse-grained floorplanning step with pipelining during HLS compilation. First, our approach provides HLS with a view on the global physical layout of the design, allowing HLS to more easily identify and pipeline the long wires, especially those crossing the die boundaries. Second, by exploiting the flexibility of HLS pipelining, the floorplanner is able to distribute the design logic across multiple dies on the FPGA device without degrading clock frequency. This prevents the placer from aggressively packing the logic on a single die which often results in local routing congestion that eventually degrades timing. Since pipelining may introduce additional latency, we further present analysis and algorithms to ensure the added latency will not compromise the overall throughput. AutoBridge can be integrated into the existing CAD toolflow for Xilinx FPGAs. In our experiments with a total of 43 design configurations, we improve the average frequency from 147 MHz to 297 MHz (a 102% improvement) with no loss of throughput and a negligible change in resource utilization. Notably, in 16 experiments we make the originally unroutable designs achieve 274 MHz on average. The tool is available at https://github.com/Licheng-Guo/AutoBridge.
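
The coupling described above can be illustrated with a toy Python sketch. It is not the AutoBridge implementation, and the abstract does not spell out the actual algorithms. The task names, the die assignment, the constant `REGS_PER_CROSSING`, and both helper functions are hypothetical. The sketch assumes a floorplan is already given as a task-to-die map, adds pipeline stages in proportion to the number of die boundaries each connection crosses, and then pads the shorter of any reconvergent paths so that data arriving at a task stays aligned, which is one simple way to keep added latency from stalling the design.

```python
from collections import defaultdict

# Hypothetical toy design: tasks connected by channels (a small DAG).
# All names and numbers are illustrative, not taken from AutoBridge.
edges = [
    ("load", "compute_a"),
    ("load", "compute_b"),
    ("compute_a", "merge"),
    ("compute_b", "merge"),
    ("merge", "store"),
]

# Assumed result of a coarse-grained floorplanning step: task -> die index.
die_of = {"load": 0, "compute_a": 0, "compute_b": 2, "merge": 1, "store": 1}

REGS_PER_CROSSING = 2  # assumed pipeline stages per die boundary crossed


def pipeline_crossings(edges, die_of, regs_per_crossing=REGS_PER_CROSSING):
    """Return added latency (cycles) per edge, scaled by die boundaries crossed."""
    return {
        (u, v): regs_per_crossing * abs(die_of[u] - die_of[v]) for u, v in edges
    }


def balance_latency(edges, added):
    """Return extra padding per edge so all inputs of a task are aligned.

    Visits tasks in topological order, records the largest added latency
    accumulated on any path reaching each task, and pads the other inputs
    up to that value.
    """
    preds = defaultdict(list)
    succs = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)
        indegree[v] += 1
        nodes.update((u, v))

    ready = sorted(n for n in nodes if indegree[n] == 0)
    arrival = {}  # accumulated added latency at each task
    padding = {}  # extra cycles inserted on each edge
    while ready:
        node = ready.pop()
        incoming = {u: arrival[u] + added[(u, node)] for u in preds[node]}
        arrival[node] = max(incoming.values(), default=0)
        for u, latency in incoming.items():
            padding[(u, node)] = arrival[node] - latency
        for nxt in succs[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return padding


added = pipeline_crossings(edges, die_of)
padding = balance_latency(edges, added)
for edge in edges:
    print(edge, "pipeline:", added[edge], "padding:", padding[edge])
```

In this toy floorplan the path through `compute_b` crosses three die boundaries in total while the path through `compute_a` crosses one, so the sketch pads the `compute_a` to `merge` connection by 4 cycles. A real flow would also have to choose the floorplan itself and account for resource limits per die, which this sketch takes as given.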
