Spatio-Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks.

Authors

Pham Huy Hieu, Salmane Houssam, Khoudour Louahdi, Crouzil Alain, Zegers Pablo, Velastin Sergio A

Affiliations

Cerema, Project team STI, 1 avenue du Colonel Roche, F-31400 Toulouse, France.

Informatics Research Institute of Toulouse (IRIT), Paul Sabatier University, Toulouse 31062, France.

Publication information

Sensors (Basel). 2019 Apr 24;19(8):1932. doi: 10.3390/s19081932.

DOI:10.3390/s19081932

PMID:31022945

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6514994/

Abstract

Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes, and deliver good performance at low computational cost. Two main challenges in this task are how to efficiently represent the spatio-temporal patterns of skeletal movements and how to learn their discriminative features for classification. This paper presents a novel skeleton-based representation and a deep learning framework for 3D action recognition using RGB-D sensors. We propose to build an action map called SPMF, a compact image representation built from skeleton poses and their motions. An Adaptive Histogram Equalization (AHE) algorithm is then applied to the SPMF to enhance its local patterns and form an enhanced action map, namely the Enhanced-SPMF. For learning and classification, we exploit Deep Convolutional Neural Networks based on the DenseNet architecture to learn an end-to-end mapping directly between input skeleton sequences and their action labels via the Enhanced-SPMFs. The proposed method is evaluated on four challenging benchmark datasets covering individual actions, interactions, multiview and large-scale data. The experimental results demonstrate that the proposed method outperforms previous state-of-the-art approaches on all benchmark tasks, whilst requiring low computational time for training and inference.
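The abstract does not spell out how the action map is encoded, so the following is only a rough sketch of the general idea under stated assumptions: joint coordinates (x, y, z) are min-max scaled into three color channels, frame-to-frame differences stand in for "motions", and a simple per-tile histogram equalization stands in for AHE. The function names (`skeleton_to_action_map`, `tiled_equalization`), the image layout, and the tile size are hypothetical and are not taken from the paper.

```python
import numpy as np

def to_uint8(x):
    """Min-max scale an array to 0..255 (an assumed normalization)."""
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        return np.zeros(x.shape, dtype=np.uint8)
    return np.round(255.0 * (x - lo) / (hi - lo)).astype(np.uint8)

def skeleton_to_action_map(seq):
    """Turn a (T, J, 3) skeleton sequence into a (J, 2T-1, 3) uint8 image.

    Columns 0..T-1 hold poses and columns T..2T-2 hold frame-to-frame
    motion; x, y, z map to the three color channels (assumed layout).
    """
    seq = np.asarray(seq, dtype=np.float64)
    poses = to_uint8(seq)                       # (T, J, 3)
    motions = to_uint8(np.diff(seq, axis=0))    # (T-1, J, 3)
    image = np.concatenate([poses, motions], axis=0)
    return image.transpose(1, 0, 2)             # joints as rows, time as columns

def equalize_tile(tile):
    """Plain histogram equalization of one 2D uint8 tile."""
    hist = np.bincount(tile.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = cdf[-1] - cdf_min
    if denom == 0:                              # constant tile: nothing to stretch
        return tile.copy()
    lut = np.round(255.0 * (cdf - cdf_min) / denom).clip(0, 255).astype(np.uint8)
    return lut[tile]

def tiled_equalization(image, tile=8):
    """Equalize each tile of each channel independently.

    A simplified stand-in for adaptive histogram equalization: it has no
    interpolation between tiles and no contrast limiting.
    """
    out = np.empty_like(image)
    h, w, c = image.shape
    for ch in range(c):
        for r in range(0, h, tile):
            for col in range(0, w, tile):
                block = image[r:r + tile, col:col + tile, ch]
                out[r:r + tile, col:col + tile, ch] = equalize_tile(block)
    return out

# Synthetic data only: 30 frames, 20 joints, random coordinates.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(30, 20, 3))
action_map = skeleton_to_action_map(sequence)
enhanced_map = tiled_equalization(action_map)
print(action_map.shape, enhanced_map.dtype)
```

In practice the enhanced image would be resized and passed to an image classifier such as a DenseNet. A library implementation of adaptive equalization would normally replace the tile loop here, which is kept only to make the local-contrast idea visible.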


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9551/6514994/f793a9bd6188/sensors-19-01932-g0A1.jpg
