School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
Neural Netw. 2024 Dec;180:106630. doi: 10.1016/j.neunet.2024.106630. Epub 2024 Aug 20.
Spiking Neural Networks (SNNs) are naturally suited to processing sequence tasks such as NLP at low power, owing to their brain-inspired spatio-temporal dynamics and spike-driven nature. Current SNNs employ "repeat coding", which re-inputs all tokens at every timestep; this fails to fully exploit the temporal relationships between tokens and introduces memory overhead. In this work, we align the number of input tokens with the number of timesteps and refer to this input coding as "individual coding". To cope with the increase in training time that individually coded SNNs incur from the dramatic increase in timesteps, we design a Bidirectional Parallel Spiking Neuron (BPSN) with the following features. First, BPSN supports parallel spike computation and effectively avoids the problem of uninterrupted firing. Second, BPSN excels at tasks with adaptive sequence lengths, a capability that existing work lacks. Third, the fusion of bidirectional information enhances the temporal modeling capability of SNNs. To validate the effectiveness of BPSN, we present SNN-BERT, a deep, directly trained SNN architecture based on the BERT model in NLP. Compared to the prior repeat 4-timestep coding baseline, our method achieves a 6.46× reduction in energy consumption and a 16.1% performance improvement, raising the performance upper bound of the SNN domain on the GLUE benchmark to 74.4%. Additionally, our method achieves 3.5× training acceleration and a 3.8× reduction in training memory. Compared with artificial neural networks of similar architecture, we obtain comparable performance with up to 22.5× energy efficiency. Code will be made available.
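The difference between the two input codings can be illustrated with a minimal shape-level sketch (all names and dimensions below are hypothetical, chosen only to make the memory contrast concrete; this is not the paper's implementation):

```python
import numpy as np

# Hypothetical dimensions: sequence length L, embedding size D,
# and T repeat-coding timesteps.
L, D, T = 8, 16, 4
tokens = np.random.randn(L, D)  # embedded input token sequence

# "Repeat coding": the full token sequence is re-input at every
# timestep, so the SNN input has shape [T, L, D] and activation
# memory scales with T * L.
repeat_input = np.broadcast_to(tokens, (T, L, D))

# "Individual coding": the number of timesteps is aligned with the
# number of tokens, so timestep t receives only token t and the
# input has shape [L, D] (one token per timestep).
individual_input = tokens

assert repeat_input.shape == (T, L, D)
assert individual_input.shape == (L, D)
```

Under this coding, the timestep count grows with sequence length, which is why sequential spiking neuron updates become a training bottleneck and a parallel neuron such as BPSN is needed.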