Electrical Department, Federal Institute of Education, Science and Technology of Sao Paulo - IFSP, Sao Paulo, Brazil.
Department of Electronic Systems Engineering, School of Engineering, University of Sao Paulo, Sao Paulo, Brazil.
Neural Netw. 2020 May;125:349-362. doi: 10.1016/j.neunet.2020.02.019. Epub 2020 Mar 3.
The design of neuromorphic chips aims to develop electronic circuits dedicated to executing artificial neural networks, mainly by exploiting parallel processing. Unsupervised learning models, such as Self-organizing Maps (SOM), may benefit from massively concurrent hardware-based implementations to meet the requirements of real-time and embedded applications. This work first presents a theoretical analysis of the algorithms implemented in hardware to compute the SOM learning and recall phases. This is important because, albeit similar, the processing steps executed in hardware are not necessarily identical to those executed in software. Then, the proposed FPGA architecture, named SOMprocessor, is described in detail. The processor circuit explores two different computational strategies for improving on the performance of current state-of-the-art works. These strategies aim to improve the data flow through the processor and its flexibility to implement different network topologies. Finally, this work presents the application of the SOMprocessor to a video categorization task. The results show that topographic and quantization errors, as well as overall accuracy, are similar between the hardware and software implementations. Moreover, the proposed FPGA architecture achieves a speedup of 3 to 4 orders of magnitude compared to CPU execution.
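For readers unfamiliar with the terms used in the abstract, the following is a minimal software sketch of a SOM's recall phase (best-matching-unit search), its learning phase (neighbourhood-weighted update), and the quantization and topographic errors. It assumes the textbook online Kohonen rule with a Gaussian neighbourhood on a rectangular grid. It is not the SOMprocessor's hardware algorithm, which the abstract notes may differ from software. All function names, grid sizes, and learning schedules here are hypothetical choices for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Recall phase: index of the neuron whose weight vector is closest to x."""
    dists = [sum((w - xi) ** 2 for w, xi in zip(wv, x)) for wv in weights]
    return min(range(len(weights)), key=dists.__getitem__)

def grid_pos(index, cols):
    """Row and column of a neuron stored in row-major order."""
    return divmod(index, cols)

def train_step(weights, x, cols, lr, sigma):
    """Learning phase: move every neuron toward x, scaled by a Gaussian
    of its grid distance to the best-matching unit (assumed rule)."""
    br, bc = grid_pos(best_matching_unit(weights, x), cols)
    for i, wv in enumerate(weights):
        r, c = grid_pos(i, cols)
        d2 = (r - br) ** 2 + (c - bc) ** 2
        h = math.exp(-d2 / (2.0 * sigma ** 2))
        for k in range(len(wv)):
            wv[k] += lr * h * (x[k] - wv[k])

def quantization_error(weights, data):
    """Mean Euclidean distance between each sample and its best-matching unit."""
    total = sum(math.dist(weights[best_matching_unit(weights, x)], x) for x in data)
    return total / len(data)

def topographic_error(weights, data, cols):
    """Fraction of samples whose best and second-best units are not
    adjacent on the grid (8-neighbourhood, an assumed convention)."""
    errors = 0
    for x in data:
        order = sorted(range(len(weights)), key=lambda i: math.dist(weights[i], x))
        r1, c1 = grid_pos(order[0], cols)
        r2, c2 = grid_pos(order[1], cols)
        if max(abs(r1 - r2), abs(c1 - c2)) > 1:
            errors += 1
    return errors / len(data)

def demo(seed=0):
    """Train a 4x4 map on random 2-D points; return (QE before, QE after, TE after)."""
    rng = random.Random(seed)
    rows, cols, dim = 4, 4, 2
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    data = [[rng.random() for _ in range(dim)] for _ in range(200)]
    qe_before = quantization_error(weights, data)
    epochs = 20
    for e in range(epochs):
        frac = e / epochs
        lr = 0.5 * (1 - frac) + 0.01      # illustrative decaying learning rate
        sigma = 2.0 * (1 - frac) + 0.3    # illustrative shrinking neighbourhood
        for x in data:
            train_step(weights, x, cols, lr, sigma)
    return qe_before, quantization_error(weights, data), topographic_error(weights, data, cols)
```

Calling `demo()` returns the quantization error before and after training together with the final topographic error, which are the same kinds of metrics the abstract reports when comparing hardware and software implementations.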