Exploring memory synchronization and performance considerations for FPGA platform using the high-abstracted OpenCL framework: Benchmarks development and analysis.

Affiliations

Department of Electrical & Computer Engineering, Gulf University for Science & Technology, Kuwait, Kuwait.

Department of Computer Engineering, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid, Jordan.

Publication information

PLoS One. 2024 May 13;19(5):e0301720. doi: 10.1371/journal.pone.0301720. eCollection 2024.

Abstract

A key benefit of the Open Computing Language (OpenCL) software framework is its ability to run across diverse architectures. Field-programmable gate arrays (FPGAs) are a high-speed computing architecture used for computation acceleration. This study investigates the impact of memory access time on overall performance in general FPGA computing environments by developing eight benchmarks within the OpenCL framework. The benchmarks capture a range of memory access behaviors and are used to assess the performance of spinning and sleeping on FPGA-based architectures. The results guide the formulation of new implementations and contribute to defining an abstraction of FPGAs, which is then used to create implementations of primitives tailored to this platform. Whereas other research develops benchmarks with the Compute Unified Device Architecture (CUDA) to examine the memory systems of diverse GPU architectures and to make recommendations for future generations of GPU computation platforms, this study analyzes the memory system of the broader FPGA computing platform. It does so by employing the highly abstracted OpenCL framework, exploring various data workload characteristics, and experimentally determining the appropriate implementation of primitives that can be integrated into a design tailored for the FPGA computing platform. Additionally, the results underscore the efficacy of a task-parallel model in reducing the need for high-cost synchronization mechanisms in designs built on general FPGA computing platforms.
