Bhowmik Pankaj, Pantho Md Jubaer Hossain, Bobda Christophe
Electrical and Computer Engineering Department, University of Florida, Gainesville, FL 32603, USA.
Sensors (Basel). 2021 Mar 4;21(5):1757. doi: 10.3390/s21051757.
With the rapid advancement of complementary metal-oxide-semiconductor (CMOS) image sensors, cameras are widely adopted for their high image quality, while the computation for vision applications is offloaded to the cloud. This raises concerns for time-critical applications such as autonomous driving, surveillance, and defense systems, since moving pixels off the sensor's focal plane is expensive. This paper presents a hardware architecture for smart cameras that identifies the salient regions of an image frame and then performs high-level inference on them, creating information at the sensor level instead of transporting raw pixels. A visual attention-oriented computational strategy filters out a significant amount of the redundant spatiotemporal data collected at the focal plane, and a computationally expensive learning model is then applied only to the interesting regions of the image. The hierarchical processing in the pixel data path follows a bottom-up architecture with massive parallelism and delivers high throughput by exploiting the large bandwidth available at the image source. We prototype the model on a field-programmable gate array (FPGA) and as an application-specific integrated circuit (ASIC) for integration with a pixel-parallel image sensor. The experimental results show that our approach achieves significant speedup and, under certain conditions, up to 45% higher energy efficiency with attention-oriented processing. Although attention-oriented processing incurs an area overhead, the gains in energy consumption, latency, and memory utilization outweigh that limitation.
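To illustrate the attention-then-inference flow the abstract describes, the following is a minimal software sketch. It is not the authors' pixel-parallel FPGA/ASIC design; the saliency measure (per-tile gradient magnitude), the tile size, the threshold, and the placeholder inference function are all illustrative assumptions used only to show how a cheap attention stage can gate an expensive learning stage.

```python
# Conceptual sketch of attention-oriented processing: a cheap bottom-up saliency
# stage selects "interesting" tiles, and only those tiles reach the expensive
# inference stage. All parameters and functions here are illustrative assumptions,
# not the paper's hardware implementation.
import numpy as np

TILE = 16               # assumed tile size for region-level attention
SALIENCY_THRESH = 0.2   # assumed normalized threshold for "interesting" tiles

def saliency_map(frame: np.ndarray) -> np.ndarray:
    """Cheap attention stand-in: mean gradient magnitude per tile."""
    gy, gx = np.gradient(frame.astype(np.float32))
    grad = np.hypot(gx, gy)
    h, w = frame.shape
    tiles = grad[: h - h % TILE, : w - w % TILE].reshape(
        h // TILE, TILE, w // TILE, TILE)
    return tiles.mean(axis=(1, 3))      # one saliency score per tile

def expensive_inference(region: np.ndarray) -> float:
    """Placeholder for the costly learning model run only on salient regions."""
    return float(region.mean())         # stand-in for a classifier score

def process_frame(frame: np.ndarray):
    scores = saliency_map(frame)
    scores = scores / (scores.max() + 1e-8)
    results = []
    for ty, tx in zip(*np.where(scores > SALIENCY_THRESH)):
        tile = frame[ty * TILE:(ty + 1) * TILE, tx * TILE:(tx + 1) * TILE]
        results.append(((int(ty), int(tx)), expensive_inference(tile)))
    return results  # compact sensor-level information instead of raw pixels

if __name__ == "__main__":
    frame = np.zeros((128, 128), dtype=np.uint8)
    frame[40:80, 40:80] = 255           # a bright square as the salient object
    print(process_frame(frame))
```

In software this only skips work per tile; the paper's point is that doing the same gating at the focal plane, with pixel-parallel hardware, avoids moving redundant pixels off the sensor at all.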