

LDD: High-Precision Training of Deep Spiking Neural Network Transformers Guided by an Artificial Neural Network.

Author Information

Liu Yuqian, Zhao Chujie, Jiang Yizhou, Fang Ying, Chen Feng

Affiliations

Department of Automation, Tsinghua University, Beijing 100084, China.

LSBDPA Beijing Key Laboratory, Beijing 100084, China.

Publication Information

Biomimetics (Basel). 2024 Jul 6;9(7):413. doi: 10.3390/biomimetics9070413.

Abstract

The rise of large-scale Transformers has led to challenges regarding computational costs and energy consumption. In this context, spiking neural networks (SNNs) offer potential solutions due to their energy efficiency and processing speed. However, the inaccuracy of surrogate gradients and the quantization of the feature space pose challenges for directly training deep SNN Transformers. To tackle these challenges, we propose a method, called LDD, to align artificial neural network (ANN) and SNN features across different abstraction levels in a Transformer network. LDD incorporates structured feature knowledge from ANNs to guide SNN training, preserving crucial information and compensating for inaccurate surrogate gradients by designing layer-wise distillation losses. The proposed approach outperforms existing methods on the CIFAR10 (96.1%), CIFAR100 (82.3%), and ImageNet (80.9%) datasets, and enables training of the deepest SNN Transformer network on ImageNet.
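To make the layer-wise distillation idea concrete, the following is a minimal sketch in Python/PyTorch, assuming a frozen ANN teacher and an SNN Transformer student that expose per-block feature tensors of matching shape. The function name ldd_style_loss, the alpha weight, and the time-averaging of spike features are illustrative assumptions, not the authors' exact LDD implementation.

# Minimal sketch of layer-wise feature distillation (assumed, not the authors' code):
# the task loss is combined with one feature-alignment term per Transformer block.
import torch.nn.functional as F

def ldd_style_loss(snn_feats, ann_feats, logits, labels, alpha=1.0):
    """Task loss plus one distillation term per Transformer block.

    snn_feats: list of student features, each of shape [T, B, N, D]
               (time steps, batch, tokens, channels).
    ann_feats: list of teacher features, each of shape [B, N, D].
    alpha:     weight of the distillation terms (hypothetical hyperparameter).
    """
    task_loss = F.cross_entropy(logits, labels)
    distill_loss = 0.0
    for s, a in zip(snn_feats, ann_feats):
        s_mean = s.mean(dim=0)                      # average spike features over time steps
        distill_loss = distill_loss + F.mse_loss(s_mean, a.detach())
    return task_loss + alpha * distill_loss

In such a scheme the teacher features are detached so that gradients flow only into the SNN student (through its surrogate gradients), while alpha trades off per-layer feature alignment against the classification loss.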


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0bb/11274868/2bc6be726ec2/biomimetics-09-00413-g001.jpg

Similar Articles

1. A universal ANN-to-SNN framework for achieving high accuracy and low latency deep Spiking Neural Networks. Neural Netw. 2024 Jun;174:106244. doi: 10.1016/j.neunet.2024.106244. Epub 2024 Mar 15.
2. IDSNN: Towards High-Performance and Low-Latency SNN Training via Initialization and Distillation. Biomimetics (Basel). 2023 Aug 18;8(4):375. doi: 10.3390/biomimetics8040375.
3. Quantization Framework for Fast Spiking Neural Networks. Front Neurosci. 2022 Jul 19;16:918793. doi: 10.3389/fnins.2022.918793. eCollection 2022.
4. Direct training high-performance deep spiking neural networks: a review of theories and methods. Front Neurosci. 2024 Jul 31;18:1383844. doi: 10.3389/fnins.2024.1383844. eCollection 2024.
5. Training much deeper spiking neural networks with a small number of time-steps. Neural Netw. 2022 Sep;153:254-268. doi: 10.1016/j.neunet.2022.06.001. Epub 2022 Jun 15.
6. Self-architectural knowledge distillation for spiking neural networks. Neural Netw. 2024 Oct;178:106475. doi: 10.1016/j.neunet.2024.106475. Epub 2024 Jun 19.
7. Fast-SNN: Fast Spiking Neural Network by Converting Quantized ANN. IEEE Trans Pattern Anal Mach Intell. 2023 Dec;45(12):14546-14562. doi: 10.1109/TPAMI.2023.3275769. Epub 2023 Nov 3.
8. An exact mapping from ReLU networks to spiking neural networks. Neural Netw. 2023 Nov;168:74-88. doi: 10.1016/j.neunet.2023.09.011. Epub 2023 Sep 11.

Cited By

1. Fully Interpretable Deep Learning Model Using IR Thermal Images for Possible Breast Cancer Cases. Biomimetics (Basel). 2024 Oct 9;9(10):609. doi: 10.3390/biomimetics9100609.

References

1. Self-architectural knowledge distillation for spiking neural networks. Neural Netw. 2024 Oct;178:106475. doi: 10.1016/j.neunet.2024.106475. Epub 2024 Jun 19.
2. IDSNN: Towards High-Performance and Low-Latency SNN Training via Initialization and Distillation. Biomimetics (Basel). 2023 Aug 18;8(4):375. doi: 10.3390/biomimetics8040375.
