
Finger Gesture Spotting from Long Sequences Based on Multi-Stream Recurrent Neural Networks.

Affiliations

Toyota Technological Institute, Nagoya 468-8511, Japan.

DENSO CORPORATION, Kariya 448-8661, Japan.

Publication information

Sensors (Basel). 2020 Jan 18;20(2):528. doi: 10.3390/s20020528.

Abstract

Gesture spotting is an essential task for recognizing the finger gestures used to control in-car touchless interfaces. Automated methods for this task must detect the video segments where gestures occur, discard natural hand movements that may look like target gestures, and operate online. In this paper, we address these challenges with a recurrent neural architecture for online finger gesture spotting. We propose a multi-stream network that merges hand and hand-location features, which helps discriminate target gestures from natural hand movements, since the two may not occur in the same 3D spatial location. Our multi-stream recurrent neural network (RNN) recurrently learns semantic information, allowing it to spot gestures online in long untrimmed video sequences. To validate our method, we collected a finger gesture dataset in the in-vehicle scenario of an autonomous car: 226 videos with more than 2100 continuous instances were captured with a depth sensor. On this dataset, our gesture spotting approach outperforms state-of-the-art methods, improving recall and precision by about 10% and 15%, respectively. Furthermore, we demonstrate that, combined with an existing gesture classifier (a 3D convolutional neural network), our approach outperforms previous hand gesture recognition methods.
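The abstract describes the architecture only at a high level, so the following is a minimal, hypothetical sketch (in PyTorch) of what a two-stream recurrent spotter along these lines could look like: one stream encodes hand-appearance features, a second encodes 3D hand-location features, and a GRU over the fused streams emits per-frame gesture scores while carrying its hidden state across frames for online use. The feature dimensions, the choice of a GRU, and all class and variable names are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MultiStreamGestureSpotter(nn.Module):
    """Hypothetical two-stream recurrent spotter (not the paper's code):
    a hand-appearance stream and a 3D hand-location stream are fused and
    fed to a GRU that scores each frame as gesture / no-gesture."""

    def __init__(self, hand_feat_dim=512, loc_feat_dim=3, hidden_dim=256, num_classes=2):
        super().__init__()
        # Per-stream encoders; dimensions are illustrative assumptions.
        self.hand_encoder = nn.Sequential(nn.Linear(hand_feat_dim, 128), nn.ReLU())
        self.loc_encoder = nn.Sequential(nn.Linear(loc_feat_dim, 32), nn.ReLU())
        # Recurrent layer over the fused streams keeps temporal context,
        # so frames can be scored as they arrive (online spotting).
        self.rnn = nn.GRU(128 + 32, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, hand_feats, loc_feats, hidden=None):
        # hand_feats: (batch, time, hand_feat_dim); loc_feats: (batch, time, loc_feat_dim)
        fused = torch.cat([self.hand_encoder(hand_feats),
                           self.loc_encoder(loc_feats)], dim=-1)
        out, hidden = self.rnn(fused, hidden)
        # Per-frame logits; returning `hidden` allows streaming inference.
        return self.classifier(out), hidden


# Streaming usage: feed one frame at a time and carry the hidden state forward.
model = MultiStreamGestureSpotter()
hidden = None
frame_hand = torch.randn(1, 1, 512)  # appearance features for one frame
frame_loc = torch.randn(1, 1, 3)     # 3D hand location for the same frame
logits, hidden = model(frame_hand, frame_loc, hidden)
```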

