Towards an Efficient CNN Inference Architecture Enabling In-Sensor Processing.

Affiliation

Electrical and Computer Engineering Department, University of Florida, Gainesville, FL 32603, USA.

Publication information

Sensors (Basel). 2021 Mar 10;21(6):1955. doi: 10.3390/s21061955.

DOI:10.3390/s21061955
PMID:33802235
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8001538/
Abstract

The astounding development of optical sensing and imaging technology, coupled with the impressive improvements in machine learning algorithms, has increased our ability to understand and extract information from scenic events. In most cases, convolutional neural networks (CNNs) are adopted to infer knowledge because of their success in automation, surveillance, and many other application domains. However, the overwhelming computational demand of convolution operations has limited their use in remote sensing edge devices. On these platforms, real-time processing remains challenging because of tight constraints on resources and power, and the transfer and processing of non-relevant image pixels act as a bottleneck for the entire system. This bottleneck can be overcome by designing a CNN inference architecture near the sensor, which exploits the high bandwidth available at the sensor interface. This paper presents an attention-based pixel processing architecture to facilitate CNN inference near the image sensor. We propose an efficient computation method that reduces dynamic power by decreasing the overall computation of the convolution operations. The method reduces redundancies using a hierarchical optimization approach: it exploits the spatio-temporal redundancies found in the incoming feature maps and performs computations only on regions selected by their relevance score. The proposed design addresses problems related to mapping computations onto an array of processing elements (PEs) and introduces a suitable network structure for communication. The PEs are highly optimized to provide low latency and low power for CNN applications. While designing the model, we exploit concepts from biological vision systems to reduce computation and energy. We prototype the model on a Virtex UltraScale+ FPGA and implement it as an application-specific integrated circuit (ASIC) using the TSMC 90 nm technology library. The results suggest that the proposed architecture significantly reduces dynamic power consumption and achieves a high speedup, surpassing the computational capabilities of existing embedded processors.
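The idea of computing only on regions selected by a relevance score can be illustrated with a small software sketch. The code below is not the authors' architecture: it assumes single-channel frames, a hypothetical relevance score defined as the mean absolute temporal difference over the input window of each output tile, and hypothetical names (`gated_conv`, `relevance`, `tile`, `threshold`). Tiles whose score does not exceed the threshold reuse the previous frame's output instead of being recomputed.

```python
def conv_at(frame, kernel, r, c):
    """One output value of a 'valid' 2D cross-correlation at (r, c)."""
    k = len(kernel)
    return sum(frame[r + i][c + j] * kernel[i][j]
               for i in range(k) for j in range(k))


def full_conv(frame, kernel):
    """Reference result: every output position is computed."""
    k = len(kernel)
    rows, cols = len(frame) - k + 1, len(frame[0]) - k + 1
    return [[conv_at(frame, kernel, r, c) for c in range(cols)]
            for r in range(rows)]


def relevance(prev, curr, r0, r1, c0, c1):
    """Assumed score: mean absolute temporal difference over an input window."""
    total = sum(abs(curr[r][c] - prev[r][c])
                for r in range(r0, r1) for c in range(c0, c1))
    return total / ((r1 - r0) * (c1 - c0))


def gated_conv(prev_frame, prev_out, curr_frame, kernel, tile=2, threshold=0.0):
    """Recompute only output tiles whose input window changed enough.

    Returns the new output map and the number of multiply-accumulates spent.
    """
    k = len(kernel)
    rows, cols = len(prev_out), len(prev_out[0])
    out = [row[:] for row in prev_out]  # start from the previous result
    macs = 0
    for tr in range(0, rows, tile):
        for tc in range(0, cols, tile):
            r_end, c_end = min(tr + tile, rows), min(tc + tile, cols)
            # Input window feeding this output tile (receptive field).
            score = relevance(prev_frame, curr_frame,
                              tr, r_end + k - 1, tc, c_end + k - 1)
            if score <= threshold:
                continue  # temporally redundant tile: reuse old output
            for r in range(tr, r_end):
                for c in range(tc, c_end):
                    out[r][c] = conv_at(curr_frame, kernel, r, c)
                    macs += k * k
    return out, macs


# Two 6x6 frames that differ in a single pixel, and a 3x3 kernel.
prev = [[(r * 6 + c) % 7 for c in range(6)] for r in range(6)]
curr = [row[:] for row in prev]
curr[0][0] += 5
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]

prev_out = full_conv(prev, kernel)
out, macs = gated_conv(prev, prev_out, curr, kernel)
full_macs = len(prev_out) * len(prev_out[0]) * 9
print(macs, "of", full_macs, "MACs performed")
```

With a threshold of 0 the gated result matches the full convolution, because a tile is skipped only when its entire input window is unchanged. A larger threshold trades accuracy for fewer operations, and how the paper balances that trade-off is not stated in the abstract.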

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/010ba28db0ac/sensors-21-01955-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/8ac78d3bd2d3/sensors-21-01955-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/9856d12c565c/sensors-21-01955-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/c18e06b9f6f5/sensors-21-01955-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/0bc9487ffe49/sensors-21-01955-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/dcf2998f7f1f/sensors-21-01955-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/2d4ef6b48f2f/sensors-21-01955-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/8ad0a8d09682/sensors-21-01955-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/a3dac09eb2e3/sensors-21-01955-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/fe8e58452889/sensors-21-01955-g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/84f428fcbd74/sensors-21-01955-g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/a4d549361fe6/sensors-21-01955-g012.jpg

Similar articles

1
Towards an Efficient CNN Inference Architecture Enabling In-Sensor Processing.
Sensors (Basel). 2021 Mar 10;21(6):1955. doi: 10.3390/s21061955.
2
HARP: Hierarchical Attention Oriented Region-Based Processing for High-Performance Computation in Vision Sensor.
Sensors (Basel). 2021 Mar 4;21(5):1757. doi: 10.3390/s21051757.
3
FPGA-based neural network accelerators for millimeter-wave radio-over-fiber systems.
Opt Express. 2020 Apr 27;28(9):13384-13400. doi: 10.1364/OE.391050.
4
From Near-Sensor to In-Sensor: A State-of-the-Art Review of Embedded AI Vision Systems.
Sensors (Basel). 2024 Aug 22;24(16):5446. doi: 10.3390/s24165446.
5
A Heterogeneous Hardware Accelerator for Image Classification in Embedded Systems.
Sensors (Basel). 2021 Apr 9;21(8):2637. doi: 10.3390/s21082637.
6
A Low-Power Hardware Architecture for Real-Time CNN Computing.
Sensors (Basel). 2023 Feb 11;23(4):2045. doi: 10.3390/s23042045.
7
A Cost-Efficient High-Speed VLSI Architecture for Spiking Convolutional Neural Network Inference Using Time-Step Binary Spike Maps.
Sensors (Basel). 2021 Sep 8;21(18):6006. doi: 10.3390/s21186006.
8
Resources and Power Efficient FPGA Accelerators for Real-Time Image Classification.
J Imaging. 2022 Apr 15;8(4):114. doi: 10.3390/jimaging8040114.
9
NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps.
IEEE Trans Neural Netw Learn Syst. 2019 Mar;30(3):644-656. doi: 10.1109/TNNLS.2018.2852335. Epub 2018 Jul 26.
10
Quantization Friendly MobileNet (QF-MobileNet) Architecture for Vision Based Applications on Embedded Platforms.
Neural Netw. 2021 Apr;136:28-39. doi: 10.1016/j.neunet.2020.12.022. Epub 2020 Dec 29.

Cited by

1
Detection of Negative Stress through Spectral Features of Electroencephalographic Recordings and a Convolutional Neural Network.
Sensors (Basel). 2021 Apr 27;21(9):3050. doi: 10.3390/s21093050.
