Zou Chenglong, Cui Xiaoxin, Kuang Yisong, Liu Kefei, Wang Yuan, Wang Xinan, Huang Ru
Institute of Microelectronics, Peking University, Beijing, China.
School of ECE, Peking University Shenzhen Graduate School, Shenzhen, China.
Front Neurosci. 2021 Nov 16;15:694170. doi: 10.3389/fnins.2021.694170. eCollection 2021.
Artificial neural networks (ANNs), such as convolutional neural networks (CNNs), have achieved state-of-the-art results for many machine learning tasks. However, inference with large-scale full-precision CNNs incurs substantial energy consumption and memory occupation, which seriously hinders their deployment on mobile and embedded systems. Highly inspired by the biological brain, spiking neural networks (SNNs) are emerging as an alternative because of their natural suitability for brain-like learning and their great energy efficiency, owing to event-driven communication and computation. Nevertheless, training a deep SNN remains a major challenge, and a large accuracy gap usually remains between ANNs and SNNs. In this paper, we introduce a hardware-friendly conversion algorithm called "scatter-and-gather" to convert quantized ANNs into lossless SNNs, in which neurons are connected with ternary {-1, 0, 1} synaptic weights. Each spiking neuron is stateless and closer to the original McCulloch-Pitts model, because it fires at most one spike and must be reset at each time step. Furthermore, we develop an incremental mapping framework to demonstrate efficient network deployment on a reconfigurable neuromorphic chip. Experimental results show that our spiking LeNet on MNIST and VGG-Net on CIFAR-10 obtain 99.37% and 91.91% classification accuracy, respectively. Moreover, the presented mapping algorithm manages network deployment on our neuromorphic chip with maximum resource efficiency and excellent flexibility. Our four-spike LeNet and VGG-Net on chip achieve real-time inference speeds of 0.38 ms/image and 3.24 ms/image, with average energy consumption of 0.28 mJ/image and 2.3 mJ/image, respectively, at 0.9 V and 252 MHz, which is nearly two orders of magnitude more efficient than traditional GPUs.
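To make the neuron model described in the abstract concrete, the following is a minimal Python sketch (not the authors' implementation, and not the scatter-and-gather conversion itself) of a stateless spiking layer with ternary {-1, 0, 1} weights, where each neuron fires at most one spike per time step and carries no membrane state across steps; the function name, threshold parameter, and shapes are illustrative assumptions.

    # Minimal sketch, assuming a stateless spiking layer as described in the abstract:
    # ternary {-1, 0, 1} weights, at most one spike per neuron per time step,
    # and an implicit reset after every step (no state carried over).
    import numpy as np

    def stateless_spiking_layer(spikes_in, W_ternary, threshold=1.0):
        # spikes_in: binary {0, 1} input spike vector for one time step
        # W_ternary: weight matrix with entries in {-1, 0, 1}
        current = W_ternary @ spikes_in                   # integer accumulation of ternary weights
        return (current >= threshold).astype(np.uint8)    # fire or stay silent, then reset

    # Hypothetical usage: 4 inputs feeding 3 neurons for a single time step
    rng = np.random.default_rng(0)
    W = rng.integers(-1, 2, size=(3, 4))                  # ternary weights
    x = np.array([1, 0, 1, 1], dtype=np.uint8)            # input spikes
    print(stateless_spiking_layer(x, W))

Because the neuron keeps no state between time steps, each step is an independent binary matrix-vector operation, which is what makes the model hardware-friendly in the sense the abstract describes.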