Resources and Power Efficient FPGA Accelerators for Real-Time Image Classification.

Authors

Kyriakos Angelos, Papatheofanous Elissaios-Alexios, Bezaitis Charalampos, Reisis Dionysios

Affiliation

Electronics Laboratory, Faculty of Physics, National and Kapodistrian University of Athens, 15772 Athens, Greece.

Publication

J Imaging. 2022 Apr 15;8(4):114. doi: 10.3390/jimaging8040114.

Abstract

Many image and video applications involve complex processing that requires hardware accelerators to achieve real-time performance. Notable among these are Machine Learning (ML) tasks that use Convolutional Neural Networks (CNNs) to detect objects in image frames. To contribute to CNN accelerator solutions, this paper focuses on the design of Field-Programmable Gate Array (FPGA) accelerators for CNNs of limited feature space, with the aim of improving performance, power consumption and resource utilization. The proposed design approach targets designs that fit within the logic and memory resources of a single FPGA device, and it mainly benefits edge, mobile and on-board satellite (OBC) computing, especially their image-processing-related applications. This work applies the approach to develop an FPGA accelerator for vessel detection on a Xilinx Virtex 7 XC7VX485T FPGA device (Advanced Micro Devices, Inc., Santa Clara, CA, USA). The resulting architecture operates on RGB images of size 80×80 or on sliding windows, and it is trained on the "Ships in Satellite Imagery" dataset. It validates the approach by achieving a clock frequency of 270 MHz, completing inference in 0.687 ms and consuming 5 W.
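To put the reported figures in context, the following minimal Python sketch derives a few quantities that follow arithmetically from the abstract: clock cycles per inference, energy per inference, and an upper bound on back-to-back throughput. It is not taken from the paper. It assumes that the 5 W figure is the average power during inference and that inferences run one after another with no overlap. The helper `sliding_window_count` and its `stride` parameter are hypothetical, since the abstract does not state the stride or the size of the full scene.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 270e6       # 270 MHz operating frequency
LATENCY_S = 0.687e-3   # 0.687 ms per inference
POWER_W = 5.0          # 5 W (assumed here to be average power during inference)
WINDOW = 80            # 80x80 RGB input window


def derived_metrics(clock_hz=CLOCK_HZ, latency_s=LATENCY_S, power_w=POWER_W):
    """Return cycles, energy (J) and back-to-back rate for one inference."""
    return {
        "cycles_per_inference": clock_hz * latency_s,
        "energy_per_inference_j": power_w * latency_s,
        # Assumes no overlap between consecutive inferences.
        "inferences_per_second": 1.0 / latency_s,
    }


def sliding_window_count(height, width, window=WINDOW, stride=WINDOW):
    """Count window positions over an image; stride is an assumed parameter."""
    if height < window or width < window:
        return 0
    rows = (height - window) // stride + 1
    cols = (width - window) // stride + 1
    return rows * cols


if __name__ == "__main__":
    for name, value in derived_metrics().items():
        print(f"{name}: {value:.6g}")
    # Hypothetical 800x800 scene scanned with non-overlapping windows.
    print("windows:", sliding_window_count(800, 800))
```

Under these assumptions, one inference takes roughly 185,000 clock cycles and about 3.4 mJ, and a hypothetical 800×800 scene tiled with non-overlapping 80×80 windows would need 100 inferences, or about 69 ms in total.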

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/478b/9032259/d1b38dad93c7/jimaging-08-00114-g001.jpg
