Towards an Efficient CNN Inference Architecture Enabling In-Sensor Processing.

Affiliation

Electrical and Computer Engineering Department, University of Florida, Gainesville, FL 32603, USA.

Publication information

Sensors (Basel). 2021 Mar 10;21(6):1955. doi: 10.3390/s21061955.

Abstract

Rapid advances in optical sensing and imaging technology, coupled with improvements in machine learning algorithms, have increased our ability to understand and extract information from events in a scene. Convolutional neural networks (CNNs) are widely adopted to infer knowledge because of their success in automation, surveillance, and many other application domains. However, the heavy computational demand of convolution operations has limited their use in remote-sensing edge devices. On these platforms, real-time processing remains challenging because of tight constraints on resources and power, and the transfer and processing of non-relevant image pixels becomes a bottleneck for the entire system. This bottleneck can be overcome by placing the CNN inference architecture near the sensor, where it can exploit the high bandwidth available at the sensor interface. This paper presents an attention-based pixel processing architecture that facilitates CNN inference near the image sensor. We propose an efficient computation method that reduces dynamic power by decreasing the overall computation of the convolution operations. The method reduces redundancy through a hierarchical optimization approach: it exploits the spatio-temporal redundancies found in the incoming feature maps and performs computations only on regions selected by their relevance score. The proposed design addresses the mapping of computations onto an array of processing elements (PEs) and introduces a suitable network structure for communication. The PEs are highly optimized to provide low latency and low power for CNN applications. In designing the model, we draw on concepts from biological vision systems to reduce computation and energy. We prototype the model on a Virtex UltraScale+ FPGA and implement it as an application-specific integrated circuit (ASIC) using the TSMC 90 nm technology library. The results suggest that the proposed architecture significantly reduces dynamic power consumption and achieves a high speedup, surpassing the computational capabilities of existing embedded processors.
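The idea of computing only on regions selected by a relevance score can be illustrated with a small sketch. The abstract does not define the score or the hierarchy, so everything here is an assumption made for illustration: the score is taken to be the mean absolute difference between consecutive frames within a tile, and the names `tile_scores`, `selective_conv`, `tile`, and `threshold` are hypothetical. This is not the paper's hardware design, and reusing cached outputs is only approximate for pixels whose receptive field crosses into a changed tile.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]


def conv_at(frame: Frame, kernel: Frame, y: int, x: int) -> float:
    """Zero-padded 'same' convolution (CNN-style, no kernel flip) at one pixel."""
    k = len(kernel) // 2
    h, w = len(frame), len(frame[0])
    acc = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                acc += frame[yy][xx] * kernel[dy + k][dx + k]
    return acc


def full_conv(frame: Frame, kernel: Frame) -> Frame:
    """Reference: convolve every pixel of the frame."""
    return [[conv_at(frame, kernel, y, x) for x in range(len(frame[0]))]
            for y in range(len(frame))]


def tile_scores(prev: Frame, curr: Frame, tile: int) -> Dict[Tuple[int, int], float]:
    """Assumed relevance score: mean absolute temporal difference per tile."""
    h, w = len(curr), len(curr[0])
    scores: Dict[Tuple[int, int], float] = {}
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            total, count = 0.0, 0
            for y in range(ty, min(ty + tile, h)):
                for x in range(tx, min(tx + tile, w)):
                    total += abs(curr[y][x] - prev[y][x])
                    count += 1
            scores[(ty, tx)] = total / count
    return scores


def selective_conv(prev: Frame, curr: Frame, prev_out: Frame, kernel: Frame,
                   tile: int = 4, threshold: float = 0.0) -> Tuple[Frame, int]:
    """Recompute only tiles whose score exceeds the threshold; reuse the rest.

    Returns the output map and the number of output pixels actually recomputed.
    """
    h, w = len(curr), len(curr[0])
    out = [row[:] for row in prev_out]  # start from the cached previous output
    computed = 0
    for (ty, tx), score in tile_scores(prev, curr, tile).items():
        if score <= threshold:
            continue  # tile treated as redundant: keep cached values
        for y in range(ty, min(ty + tile, h)):
            for x in range(tx, min(tx + tile, w)):
                out[y][x] = conv_at(curr, kernel, y, x)
                computed += 1
    return out, computed


if __name__ == "__main__":
    prev = [[0.0] * 8 for _ in range(8)]
    curr = [row[:] for row in prev]
    curr[1][1] = 1.0  # a single changed pixel inside the top-left tile
    kernel = [[1.0] * 3 for _ in range(3)]
    out, computed = selective_conv(prev, curr, full_conv(prev, kernel), kernel)
    print(f"recomputed {computed} of {8 * 8} output pixels")
```

In this toy case only one of the four 4x4 tiles changes, so a quarter of the output pixels are recomputed. A real design would also need to handle tile borders, for example by dilating the selected region by the kernel radius.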

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14d8/8001538/010ba28db0ac/sensors-21-01955-g001.jpg
