Chen Guang, Chen Jieneng, Lienen Marten, Conradt Jörg, Röhrbein Florian, Knoll Alois C
College of Automotive Engineering, Tongji University, Shanghai, China.
Chair of Robotics, Artificial Intelligence and Real-time Systems, Technische Universität München, Munich, Germany.
Front Neurosci. 2019 Feb 12;13:73. doi: 10.3389/fnins.2019.00073. eCollection 2019.
A neuromorphic vision sensor is a novel passive, frameless sensing modality with several advantages over conventional cameras. Frame-based cameras have an average frame rate of 30 fps, which causes motion blur when capturing fast motion such as hand gestures. Rather than wastefully sending entire images at a fixed frame rate, neuromorphic vision sensors transmit only the local pixel-level changes induced by movement in a scene, at the moment they occur. This leads to advantageous characteristics, including low energy consumption, high dynamic range, a sparse event stream, and low response latency. In this study, a novel representation learning method is proposed for event-based gesture recognition: Fixed Length Gists Representation (FLGR) learning. Previous methods accumulate events over a time window (e.g., 30 ms) into video frames to obtain an image-level representation. However, this accumulated-frame-based representation gives up the event-driven paradigm that makes neuromorphic vision sensors attractive. New representations are urgently needed to fill the gap in non-accumulated-frame-based representations and to exploit the further capabilities of neuromorphic vision. The proposed FLGR is a sequence learned by a mixture density autoencoder, and it better preserves the nature of event-based data. Because FLGR has a fixed-length data format, it is easy to feed to a sequence classifier. Moreover, an RNN-HMM hybrid is proposed to address the continuous gesture recognition problem: a recurrent neural network (RNN) classifies FLGR sequences, while a hidden Markov model (HMM) localizes candidate gestures and improves the result in a continuous sequence. A neuromorphic continuous hand gesture dataset (Neuro ConGD Dataset) with 17 hand gesture classes was developed for the neuromorphic research community.
We hope that FLGR will inspire further study of event-based, highly efficient, high-speed, and high-dynamic-range sequence classification tasks.
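For readers unfamiliar with the accumulated-frame baseline that the abstract contrasts FLGR with, the following is a minimal sketch of binning an event stream into 30 ms frames. It is not the authors' code. The event layout (timestamps in microseconds, pixel coordinates, a binary polarity flag), the function name `accumulate_events`, and the choice of signed event counts per pixel are all assumptions made for illustration.

```python
import numpy as np


def accumulate_events(t_us, x, y, polarity, width, height, window_us=30_000):
    """Bin events into fixed-duration frames of signed per-pixel counts.

    Hypothetical helper: the event format and the signed-count encoding
    are illustrative assumptions, not taken from the paper.
    """
    t_us = np.asarray(t_us, dtype=np.int64)
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    polarity = np.asarray(polarity, dtype=np.int64)

    if t_us.size == 0:
        return np.zeros((0, height, width), dtype=np.int64)

    # Index of the time window each event falls into, relative to the first event.
    frame_idx = (t_us - t_us.min()) // window_us
    n_frames = int(frame_idx.max()) + 1

    # ON events (polarity > 0) count as +1, OFF events as -1.
    signed = np.where(polarity > 0, 1, -1).astype(np.int64)

    frames = np.zeros((n_frames, height, width), dtype=np.int64)
    # np.add.at accumulates correctly when several events hit the same pixel.
    np.add.at(frames, (frame_idx, y, x), signed)
    return frames
```

Calling `accumulate_events(t_us, x, y, polarity, width, height)` returns an array of shape `(n_frames, height, width)`. Every event within a window collapses into one image, so the timing of individual events inside the window is lost; this is the property of frame accumulation that the abstract argues against.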