Keys for Action: An Efficient Keyframe-Based Approach for 3D Action Recognition Using a Deep Neural Network.

Affiliations

Department of Computer Science, National University of Computer and Emerging Sciences, Islamabad 44000, Pakistan.

Department of Computer Science II, Universität Bonn, 53115 Bonn, Germany.

Publication Information

Sensors (Basel). 2020 Apr 15;20(8):2226. doi: 10.3390/s20082226.

Abstract

In this paper, we propose a novel and efficient framework for 3D action recognition using a deep learning architecture. First, we develop a 3D normalized pose space consisting only of 3D normalized poses, generated by discarding translation and orientation information. From these poses, we extract joint features and feed them to a Deep Neural Network (DNN) to learn the action model. Our DNN consists of two hidden layers with the sigmoid activation function and an output layer with the softmax function. Furthermore, we propose a keyframe extraction methodology that efficiently extracts, from a motion sequence of 3D frames, the keyframes that contribute substantially to the performance of the action. In this way, we eliminate redundant frames and shorten the motion: the sequence is summarized while its original semantics are preserved. Only the remaining essential, informative frames are considered during action recognition, which makes the proposed pipeline fast and robust. Finally, we evaluate our framework extensively on the publicly available benchmark Motion Capture (MoCap) datasets HDM05 and CMU. Our experiments show that the proposed scheme significantly outperforms other state-of-the-art approaches.
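
To make the normalization step concrete, here is a minimal sketch of how translation and orientation can be discarded from a single pose. It assumes the pose is a (J, 3) NumPy array, and `root`, `lhip`, and `rhip` are hypothetical joint indices; the abstract does not specify the paper's exact normalization procedure.

```python
import numpy as np

def normalize_pose(joints, root=0, lhip=1, rhip=6):
    """Map a raw 3D pose into a normalized pose space.

    joints: (J, 3) array of joint positions.
    root/lhip/rhip: hypothetical joint indices; real skeletons
    (e.g., the HDM05 layout) number their joints differently.
    """
    pose = joints - joints[root]          # discard global translation
    # Yaw of the hip axis projected onto the ground (x-z) plane.
    hip = pose[rhip] - pose[lhip]
    yaw = np.arctan2(hip[2], hip[0])
    c, s = np.cos(yaw), np.sin(yaw)
    # Rotate about the vertical (y) axis so the hips face a canonical
    # direction, discarding global orientation.
    R = np.array([[  c, 0.0,   s],
                  [0.0, 1.0, 0.0],
                  [ -s, 0.0,   c]])
    return pose @ R.T
```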
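
The classifier itself is small. The following PyTorch sketch mirrors the architecture named in the abstract (two sigmoid hidden layers, softmax output); the input dimension, layer widths, and number of classes are placeholders, since the abstract does not give them.

```python
import torch
import torch.nn as nn

class ActionDNN(nn.Module):
    """Two sigmoid hidden layers and a softmax output, per the abstract.

    in_dim, hidden, and n_classes are placeholders: the abstract does
    not state the layer sizes or the number of action classes.
    """
    def __init__(self, in_dim: int, hidden: int = 256, n_classes: int = 65):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Sigmoid(),
            nn.Linear(hidden, hidden), nn.Sigmoid(),
            nn.Linear(hidden, n_classes),  # raw class logits
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Softmax over classes; when training with nn.CrossEntropyLoss,
        # return self.net(x) directly, since that loss applies log-softmax.
        return torch.softmax(self.net(x), dim=-1)
```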
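
The abstract does not spell out the keyframe selection rule, so the sketch below substitutes a generic greedy distance-threshold heuristic: a frame is kept only if its pose has moved far enough from the last kept frame. It illustrates the redundancy-removal idea, not the paper's actual algorithm; `tol` is an assumed tuning parameter.

```python
import numpy as np

def extract_keyframes(frames: np.ndarray, tol: float = 0.1) -> np.ndarray:
    """Drop redundant frames from a motion sequence.

    Generic greedy heuristic standing in for the paper's (unspecified)
    keyframe extraction method.  frames: (T, J, 3) normalized poses;
    tol: assumed minimum mean per-joint displacement for a frame to
    count as informative.
    """
    keep = [0]  # the first frame is always a keyframe
    for t in range(1, len(frames)):
        # Mean Euclidean displacement of all joints since the last keyframe.
        step = np.linalg.norm(frames[t] - frames[keep[-1]], axis=-1).mean()
        if step > tol:
            keep.append(t)
    return frames[keep]
```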

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e153/7218879/52244d28ed40/sensors-20-02226-g001.jpg
