Zhan Qiugang, Liu Guisong, Xie Xiurui, Tao Ran, Zhang Malu, Tang Huajin
IEEE Trans Image Process. 2024;33:4274-4287. doi: 10.1109/TIP.2024.3430043. Epub 2024 Jul 30.
Recent advances in bio-inspired vision with event cameras and associated spiking neural networks (SNNs) have provided promising solutions for low-power neuromorphic tasks. However, because research on event cameras is still in its infancy, the amount of labeled event-stream data is far smaller than that of RGB databases. The traditional approach of increasing the sample size by converting static images into simulated event streams cannot reproduce characteristics of event cameras such as high temporal resolution. To exploit both the rich knowledge in labeled RGB images and the characteristics of the event camera, we propose a transfer learning method from the RGB domain to the event domain in this paper. Specifically, we first introduce a transfer learning framework named R2ETL (RGB to Event Transfer Learning), comprising a novel encoding alignment module and a feature alignment module. We then introduce the temporal centered kernel alignment (TCKA) loss function to improve the efficiency of transfer learning; it aligns the distributions of temporal neuron states by adding a temporal learning constraint. Finally, we theoretically analyze the amount of data required by a deep neuromorphic model to establish the necessity of our method. Extensive experiments demonstrate that our proposed framework outperforms state-of-the-art SNN and artificial neural network (ANN) models trained on event-stream datasets, including N-MNIST, CIFAR10-DVS, and N-Caltech101. This indicates that the R2ETL framework is able to leverage the knowledge of labeled RGB images to aid the training of SNNs on event streams.
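The abstract does not spell out the TCKA formulation, so the sketch below is only one plausible reading: a linear centered kernel alignment (CKA) score computed per timestep between features from an RGB branch and the temporal states of the event-branch SNN, with the mismatch averaged over timesteps as the loss. The names `linear_cka`, `tcka_loss`, `rgb_feats`, and `event_feats` are all hypothetical, not taken from the paper.

```python
import torch

def linear_cka(X, Y):
    """Linear CKA between two feature matrices of shape (batch, features).
    Returns a scalar similarity in [0, 1]."""
    X = X - X.mean(dim=0, keepdim=True)  # center each feature dimension
    Y = Y - Y.mean(dim=0, keepdim=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = (Y.t() @ X).norm() ** 2
    den = (X.t() @ X).norm() * (Y.t() @ Y).norm()
    return num / (den + 1e-8)

def tcka_loss(rgb_feats, event_feats):
    """Hypothetical temporal CKA loss: penalize dissimilarity between the
    static RGB features and the SNN states at every timestep.
    rgb_feats:   (batch, features)            -- RGB-branch features
    event_feats: (timesteps, batch, features) -- SNN states per timestep"""
    losses = [1.0 - linear_cka(rgb_feats, event_feats[t])
              for t in range(event_feats.shape[0])]
    return torch.stack(losses).mean()
```

In training, such a term would presumably be added to the task loss with a weighting hyperparameter, acting as the "temporal learning constraint" the abstract mentions.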