IEEE Trans Neural Netw Learn Syst. 2018 May;29(5):1427-1440. doi: 10.1109/TNNLS.2017.2669522. Epub 2017 Mar 8.
In this paper, we propose a bioinspired model for human action recognition that models the neural mechanisms of information processing in two visual cortical areas: the primary visual cortex (V1) and the middle temporal cortex (MT), which is dedicated to motion. The model, named V1-MT, is composed of V1 and MT models (layers) corresponding to these cortical areas, each built with layered spiking neural networks (SNNs). Several properties of V1 and MT neurons, namely direction and speed selectivity, spatiotemporal inseparability, and center-surround suppression, are integrated into the SNNs. Based on speed and direction selectivity, the V1 and MT models contain multiple SNN channels, each of which processes the motion information in a sequence using neurons whose spatiotemporal tuning corresponds to one speed and several directions. To perform this processing in accordance with the spatiotemporal inseparability and center-surround suppression of neurons, we propose two operations: input signal perception with 3-D Gabor filters and surround inhibition with 3-D difference-of-Gaussians functions. Neurons are then modeled with our simplified integrate-and-fire model, which transforms motion information into spike trains. Next, we define a new feature vector to represent human actions: a mean motion map computed from the spike trains in all channels. Finally, a support vector machine is trained to classify the actions represented by these feature vectors. We conducted extensive experiments on public action databases, and the results show that our model outperforms other bioinspired models and rivals state-of-the-art approaches.
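The spike-generation and feature steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the authors' simplified integrate-and-fire model or the exact definition of the mean motion map, so everything below is an assumption: `integrate_and_fire` is a generic leaky integrate-and-fire neuron with reset, `mean_motion_map` is taken to be the firing rate averaged over time and over channels, and the parameter names `threshold` and `leak` are hypothetical. The 3-D Gabor filtering and surround inhibition stages are omitted; the `drive` input stands in for their output.

```python
def integrate_and_fire(drive, threshold=1.0, leak=0.1):
    """Turn a per-frame input drive into a binary spike train.

    Generic leaky integrate-and-fire with reset (an assumed stand-in,
    not the paper's simplified model).
    """
    v = 0.0
    spikes = []
    for current in drive:
        # Decay the membrane potential, then add the rectified input.
        v = (1.0 - leak) * v + max(current, 0.0)
        if v >= threshold:
            spikes.append(1)
            v = 0.0  # reset after a spike
        else:
            spikes.append(0)
    return spikes


def mean_motion_map(channels):
    """Average firing rate per location over time and over channels.

    `channels` is a list of channels; each channel is a rows x cols grid
    whose entries are spike trains (lists of 0/1). This definition of the
    map is an assumption made for illustration.
    """
    rows = len(channels[0])
    cols = len(channels[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for channel in channels:
        for i in range(rows):
            for j in range(cols):
                train = channel[i][j]
                out[i][j] += sum(train) / len(train) / len(channels)
    return out


# Usage: two hypothetical channels observing a single location.
strong = integrate_and_fire([0.5, 0.5, 0.5, 0.5], threshold=1.0, leak=0.0)
silent = integrate_and_fire([0.0, 0.0, 0.0, 0.0], threshold=1.0, leak=0.0)
feature = mean_motion_map([[[strong]], [[silent]]])
```

In a full pipeline, the flattened map would serve as the feature vector passed to a support vector machine; that classification step is left out here to keep the sketch dependency-free.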