Farabet Clément, Paz Rafael, Pérez-Carrasco Jose, Zamarreño-Ramos Carlos, Linares-Barranco Alejandro, Lecun Yann, Culurciello Eugenio, Serrano-Gotarredona Teresa, Linares-Barranco Bernabe
Computer Science Department, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.
Front Neurosci. 2012 Apr 10;6:32. doi: 10.3389/fnins.2012.00032. eCollection 2012.
Most scene segmentation and categorization architectures that extract features from images and patches make extensive use of 2D convolution operations for template matching, template search, and denoising. Convolutional Neural Networks (ConvNets) are one example of such architectures that can implement general-purpose bio-inspired vision systems. On standard digital computers, 2D convolutions are usually expensive in terms of resource consumption and severely limit efficient real-time applications. Nevertheless, neuro-cortex-inspired solutions, such as dedicated Frame-Based or Frame-Free Spiking ConvNet Convolution Processors, are advancing real-time visual processing. These two approaches share the same neural inspiration, but each solves the problem in a different way. Frame-Based ConvNets process video information frame by frame in a very robust and fast way that requires using and sharing the available hardware resources (such as multipliers and adders). Hardware resources are fixed and are time-multiplexed by fetching data in and out. Memory bandwidth and size are therefore important for good performance. Spike-based convolution processors, on the other hand, are a frame-free alternative that can convolve a spike-based source of visual information with very low latency, which makes them ideal for very high-speed applications. However, their hardware resources need to be available all the time and cannot be time-multiplexed. The hardware should therefore be modular, reconfigurable, and expandable. Hardware implementations in both custom VLSI integrated circuits (digital and analog) and FPGAs have already been used to demonstrate the performance of these systems. In this paper we present a comparison study of these two neuro-inspired solutions. We briefly describe both systems and discuss their differences, pros, and cons.
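The contrast between the two processing styles described in the abstract can be illustrated with a minimal Python sketch. It is not the implementation of either hardware system. The function names, the (x, y) event format, the zero-padded borders, and the integrate-and-fire threshold with reset to zero are all assumptions made for illustration. The frame-based function visits every output pixel of a complete frame, while the event-driven function updates only the neighbourhood of each incoming spike as it arrives.

```python
def frame_based_conv(frame, kernel):
    """Frame-based sketch: visit every output pixel of a complete frame.

    Computes a same-size 2D convolution with zero padding. The kernel is
    assumed to be square with an odd side length.
    """
    h, w = len(frame), len(frame[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(k):
                for kx in range(k):
                    sy, sx = y - (ky - r), x - (kx - r)
                    if 0 <= sy < h and 0 <= sx < w:
                        acc += frame[sy][sx] * kernel[ky][kx]
            out[y][x] = acc
    return out


def event_driven_conv(events, kernel, height, width, threshold=float("inf")):
    """Event-driven sketch: each input spike adds the kernel around its address.

    `events` is a sequence of (x, y) spike addresses in arrival order.
    Assumed neuron model: a pixel whose state reaches `threshold` emits an
    output event and is reset to zero. With the default infinite threshold
    the state simply accumulates and no output events are produced.
    """
    k = len(kernel)
    r = k // 2
    state = [[0.0] * width for _ in range(height)]
    out_events = []
    for x, y in events:
        for ky in range(k):
            for kx in range(k):
                ty, tx = y + ky - r, x + kx - r
                if 0 <= ty < height and 0 <= tx < width:
                    state[ty][tx] += kernel[ky][kx]
                    if state[ty][tx] >= threshold:
                        out_events.append((tx, ty))
                        state[ty][tx] = 0.0
    return state, out_events
```

With the threshold disabled, accumulating the events one by one gives the same array as first collecting them into a frame of per-pixel spike counts and convolving that frame. The work in the event-driven version scales with the number of events rather than with the frame size, and its state array must stay resident for the whole stream, which mirrors the abstract's remark that spike-based resources cannot be time-multiplexed.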