Mapping Neural Networks to FPGA-Based IoT Devices for Ultra-Low Latency Processing.

Authors

Wielgosz Maciej, Karwatowski Michał

Affiliations

Faculty of Computer Science, Electronics and Telecommunications, AGH University of Science and Technology, al. Adama Mickiewicza 30, 30-059 Cracow, Poland.

Academic Computer Centre CYFRONET AGH, ul. Nawojki 11, 30-072 Cracow, Poland.

Publication information

Sensors (Basel). 2019 Jul 5;19(13):2981. doi: 10.3390/s19132981.

DOI:10.3390/s19132981
PMID:31284516
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6651173/
Abstract

For Internet of things (IoT) infrastructure, fast access to knowledge becomes critical. In some application domains, such as robotics, autonomous driving, predictive maintenance, and anomaly detection, the response time of the system is more critical to ensuring Quality of Service than the quality of the answer. In this paper, we propose a methodology: a set of predefined steps for mapping models to hardware, especially field programmable gate arrays (FPGAs), with the main focus on latency reduction. A multi-objective covariance matrix adaptation evolution strategy (MO-CMA-ES) was employed along with custom scores for sparsity, bit-width of the representation, and quality of the model. Furthermore, we created a framework that enables mapping of neural models to FPGAs. The proposed solution is validated using three case studies, with the Xilinx Zynq UltraScale+ MPSoC XCZU15EG as the platform. The results show the compression ratios obtained by quantization and pruning in different scenarios, with and without retraining. Using our publicly available framework, we achieved a latency of 210 ns for a single processing step of a model composed of two long short-term memory (LSTM) layers and a single dense layer.
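The abstract names three quantities that are traded off against each other: sparsity, bit-width of the representation, and model quality. The following is a minimal sketch of how such scores could be computed for one weight vector and how candidate configurations could be compared by Pareto dominance. It is not the authors' framework and not MO-CMA-ES itself: the function names, the magnitude-pruning rule, the signed fixed-point format, the use of mean squared weight error as a stand-in for model quality, and the 32-bit baseline are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def prune_by_magnitude(weights: Sequence[float], threshold: float) -> List[float]:
    """Set every weight whose absolute value is below the threshold to zero."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def quantize_fixed_point(weights: Sequence[float], total_bits: int, frac_bits: int) -> List[float]:
    """Round weights to a signed fixed-point grid, saturating at the range limits."""
    scale = 1 << frac_bits
    q_min = -(1 << (total_bits - 1))
    q_max = (1 << (total_bits - 1)) - 1
    quantized = []
    for w in weights:
        code = round(w * scale)  # Python rounds exact ties to the even integer
        code = max(q_min, min(q_max, code))
        quantized.append(code / scale)
    return quantized


def sparsity(weights: Sequence[float]) -> float:
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0) / len(weights)


def compression_ratio(weights: Sequence[float], total_bits: int, baseline_bits: int = 32) -> float:
    """Dense baseline size divided by the size of the nonzero weights only.

    Assumption: the cost of storing the positions of nonzero weights is ignored.
    """
    nonzero = sum(1 for w in weights if w != 0)
    if nonzero == 0:
        return float("inf")
    return (len(weights) * baseline_bits) / (nonzero * total_bits)


def mean_squared_error(reference: Sequence[float], approx: Sequence[float]) -> float:
    """Hypothetical quality proxy: how far the compressed weights drift from the originals."""
    return sum((r - a) ** 2 for r, a in zip(reference, approx)) / len(reference)


def dominates(a: Tuple[float, ...], b: Tuple[float, ...]) -> bool:
    """True if a is no worse than b in every objective and better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates, in their original order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    weights = [0.50, -0.02, 0.26, 0.01, -0.75, 0.04, 0.13, -0.30]
    candidates = []
    for threshold in (0.0, 0.05, 0.2):
        for total_bits, frac_bits in ((8, 6), (4, 2)):
            compressed = quantize_fixed_point(
                prune_by_magnitude(weights, threshold), total_bits, frac_bits
            )
            # All three objectives are expressed so that smaller is better.
            candidates.append(
                (1.0 - sparsity(compressed), float(total_bits), mean_squared_error(weights, compressed))
            )
    for point in pareto_front(candidates):
        print(point)
```

An evolutionary search such as the one named in the abstract would propose the pruning and bit-width settings itself instead of enumerating a fixed grid, and a real quality score would be measured on the model's task rather than on the weights.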

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/494cbb65bfe3/sensors-19-02981-g013.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/dabad190ba6c/sensors-19-02981-g0A1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/82830be762f6/sensors-19-02981-g0A2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/f9a0cf647bc1/sensors-19-02981-g0A3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/7e22b6cae84a/sensors-19-02981-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/1da50bc4c29d/sensors-19-02981-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/25e57592ceed/sensors-19-02981-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/3080a00b5ac6/sensors-19-02981-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/a25687ac1cc4/sensors-19-02981-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/da1adb842c50/sensors-19-02981-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/c5b09f2910c7/sensors-19-02981-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/9bcf25ec88b1/sensors-19-02981-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/74df158f44f2/sensors-19-02981-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/55ab40a5d0c3/sensors-19-02981-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/60efac8aa5a8/sensors-19-02981-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8181/6651173/f864ece9b4a2/sensors-19-02981-g012.jpg

Cited by

1
Review of State-of-the-Art FPGA Applications in IoT Networks.
Sensors (Basel). 2022 Oct 2;22(19):7496. doi: 10.3390/s22197496.
2
Algorithm and Distributed Computing for the Internet of Things.
Sensors (Basel). 2020 Aug 12;20(16):4513. doi: 10.3390/s20164513.
