IEEE Trans Neural Netw Learn Syst. 2024 Oct;35(10):14315-14329. doi: 10.1109/TNNLS.2023.3278265. Epub 2024 Oct 7.
Spiking neural networks (SNNs) mimic the brain's computational strategies and show strong capabilities in spatiotemporal information processing. Visual attention, an essential factor in human perception, is the dynamic process by which biological vision systems select salient regions. Although visual attention mechanisms have achieved great success in computer vision applications, they have rarely been introduced into SNNs. In this study, inspired by experimental observations of predictive attentional remapping, we propose a new spatial-channel-temporal-fused attention (SCTFA) module that guides SNNs to efficiently capture underlying target regions by using accumulated historical spatial-channel information. Through a systematic evaluation on three event stream datasets (DVS Gesture, SL-Animals-DVS, and MNIST-DVS), we demonstrate that the SNN with the SCTFA module (SCTFA-SNN) not only significantly outperforms the baseline SNN (BL-SNN) and two other SNN models with degenerated attention modules, but also achieves accuracy competitive with existing state-of-the-art (SOTA) methods. Additionally, our detailed analysis shows that the proposed SCTFA-SNN model is strongly robust to noise and remains stable when faced with incomplete data, while maintaining acceptable complexity and efficiency. Overall, these findings indicate that incorporating appropriate cognitive mechanisms of the brain may be a promising way to improve the capabilities of SNNs.
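The abstract does not specify how the SCTFA module is built, so the following is only a toy sketch of the general idea of attention driven by accumulated historical spatial-channel information. Everything in it is an assumption made for illustration and is not the authors' method: the function name `fused_attention_step`, the leaky trace with factor `decay`, the sigmoid gating, and the choice to compute the weights from activity before the current time step.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fused_attention_step(spikes, history, decay=0.5):
    """One time step of a toy spatial-channel attention with temporal memory.

    spikes:  (C, H, W) binary event/spike tensor for the current time step.
    history: (C, H, W) leaky trace of activity from earlier time steps.
    Returns the attended features and the updated history.
    NOTE: hypothetical illustration, not the SCTFA module from the paper.
    """
    # Channel weights: one scalar per channel, from the spatial mean of the trace.
    channel_w = sigmoid(history.mean(axis=(1, 2)))   # shape (C,)
    # Spatial weights: one scalar per pixel, from the channel mean of the trace.
    spatial_w = sigmoid(history.mean(axis=0))        # shape (H, W)
    # Fuse both weightings with the current input via broadcasting.
    attended = spikes * channel_w[:, None, None] * spatial_w[None, :, :]
    # Temporal part: leaky accumulation of the current activity into the trace.
    new_history = decay * history + spikes
    return attended, new_history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, C, H, W = 5, 2, 4, 4
    # Synthetic sparse event stream standing in for DVS input.
    stream = (rng.random((T, C, H, W)) < 0.3).astype(float)
    history = np.zeros((C, H, W))
    for t in range(T):
        attended, history = fused_attention_step(stream[t], history)
        print(t, float(attended.sum()))
```

In this sketch the weights at each step depend only on earlier activity, which loosely echoes the predictive flavor mentioned in the abstract; a trained module would learn its weighting functions rather than use fixed means and sigmoids.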