Che Kaiwei, Zhou Zhaokun, Niu Jun, Ma Zhengyu, Fang Wei, Chen Yanqi, Shen Shuaijie, Yuan Li, Tian Yonghong
School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, Shenzhen, Guangdong, China.
Peng Cheng Laboratory, Shenzhen, Guangdong, China.
Front Neurosci. 2024 Jul 23;18:1372257. doi: 10.3389/fnins.2024.1372257. eCollection 2024.
The integration of self-attention mechanisms into Spiking Neural Networks (SNNs) has garnered considerable interest in the realm of advanced deep learning, primarily due to their biological properties. Recent advancements in SNN architecture, such as Spikformer, have demonstrated promising outcomes. However, we observe that Spikformer may exhibit excessive energy consumption, potentially attributable to redundant channels and blocks.
To mitigate this issue, we propose a one-shot Spiking Transformer Architecture Search method, namely Auto-Spikformer. Auto-Spikformer extends the search space to include both the Transformer architecture and the inner parameters of SNNs. We train and search the supernet using weight entanglement, evolutionary search, and the proposed Discrete Spiking Parameters Search (DSPS) method. Benefiting from these methods, subnets that inherit weights from the supernet achieve performance comparable to the original Spikformer, even without retraining. Moreover, we propose a new fitness function that seeks a Pareto-optimal combination balancing energy consumption and accuracy.
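The energy-accuracy trade-off described above can be illustrated with a minimal sketch. The scalar weighting and the function names below are hypothetical, not the paper's exact fitness formula; the sketch only shows the general idea of ranking candidate subnets and keeping the non-dominated (Pareto-optimal) ones.

```python
# Hypothetical sketch of a fitness function and Pareto filtering for
# evolutionary subnet search; the weighting scheme is illustrative only.

def fitness(accuracy, energy_mj, alpha=1.0, beta=0.1):
    """Higher is better: reward accuracy, penalize energy consumption."""
    return alpha * accuracy - beta * energy_mj

def pareto_front(candidates):
    """Keep candidates not dominated by any other candidate.

    A candidate (acc, energy) is dominated if some other candidate has
    accuracy >= acc and energy <= energy, with at least one strict.
    """
    front = []
    for acc, en in candidates:
        dominated = any(
            a >= acc and e <= en and (a > acc or e < en)
            for a, e in candidates
        )
        if not dominated:
            front.append((acc, en))
    return front

# Toy (accuracy, energy-in-mJ) pairs for four candidate subnets.
cands = [(0.95, 5.0), (0.94, 3.0), (0.90, 4.0), (0.93, 2.5)]
front = pareto_front(cands)  # (0.90, 4.0) is dominated by (0.94, 3.0)
```

In practice the evolutionary search would evaluate each subnet with inherited supernet weights, score it with such a fitness, and mutate/crossover the highest-ranked candidates.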
Our experimental results demonstrate the effectiveness of Auto-Spikformer, which outperforms the original Spikformer and most CNN and ViT models while using fewer parameters and less energy.