Liu Ying, Luo Xiaoling, Zhang Ya, Zhang Yun, Zhang Wei, Qu Hong
School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, PR China.
Key Laboratory of Higher Education of Sichuan Province for Enterprise Informationalization and Internet of Things, Sichuan University of Science and Engineering, Yibin 644000, PR China.
Neural Netw. 2025 Feb;182:106918. doi: 10.1016/j.neunet.2024.106918. Epub 2024 Nov 26.
Current vision-inspired spiking neural networks (SNNs) typically build their model structures around a single mechanism and neglect the integration of multiple biological features. This narrow focus, together with limited synaptic plasticity, hinders their ability to implement biologically realistic visual processing. To address these issues, we propose Spike-VisNet, a novel retina-inspired framework designed to enhance visual recognition capabilities. The framework simulates both the function and the layered structure of the retina. To further enhance this architecture, we integrate the FocusLayer-STDP learning rule, which allows Spike-VisNet to dynamically adjust synaptic weights in response to varying visual stimuli. This rule combines channel attention, inhibition mechanisms, and competitive mechanisms with spike-timing-dependent plasticity (STDP), significantly improving synaptic adaptability and visual recognition performance. Comprehensive evaluations on benchmark datasets show that Spike-VisNet outperforms other STDP-based SNNs, achieving precision scores of 98.6% on MNIST, 93.29% on ETH-80, and 86.27% on CIFAR-10. These results highlight its effectiveness and robustness and show Spike-VisNet's potential to simulate human visual processing and to address complex real-world visual challenges.
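For readers unfamiliar with STDP combined with competition, the following is a minimal sketch of a generic, simplified STDP update with winner-take-all selection. It is not the FocusLayer-STDP rule, whose details are not given in this abstract, and it omits channel attention and inhibition. The function names, the multiplicative form of the update, and the learning rates `a_plus` and `a_minus` are illustrative assumptions.

```python
import numpy as np


def stdp_update(weights, pre_spike_times, post_spike_time,
                a_plus=0.004, a_minus=0.003):
    """Simplified multiplicative STDP for one postsynaptic neuron.

    Assumption: weights lie in [0, 1]; a presynaptic neuron that never
    fired has spike time np.inf. Synapses whose presynaptic spike came
    at or before the postsynaptic spike are strengthened, all others
    are weakened. The w * (1 - w) factor slows changes near 0 and 1.
    """
    potentiate = pre_spike_times <= post_spike_time
    delta = np.where(potentiate, a_plus, -a_minus) * weights * (1.0 - weights)
    return np.clip(weights + delta, 0.0, 1.0)


def winner_take_all_step(weight_matrix, pre_spike_times, post_spike_times):
    """Apply STDP only to the earliest-firing postsynaptic neuron.

    weight_matrix has shape (n_post, n_pre). post_spike_times has shape
    (n_post,), with np.inf for neurons that stayed silent.
    Returns an updated copy of the weights and the winner index, or
    None as the index when no postsynaptic neuron fired.
    """
    if not np.isfinite(post_spike_times).any():
        return weight_matrix.copy(), None
    winner = int(np.argmin(post_spike_times))
    updated = weight_matrix.copy()
    updated[winner] = stdp_update(
        weight_matrix[winner], pre_spike_times, post_spike_times[winner]
    )
    return updated, winner
```

In this sketch the competition is purely temporal: the neuron that spikes first is the only one allowed to learn from the current stimulus, which pushes different neurons toward different input patterns.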