A Survey on 3D Skeleton-Based Action Recognition Using Learning Method.

Author information

Ren Bin, Liu Mengyuan, Ding Runwei, Liu Hong

Affiliations

University of Pisa, Pisa, Italy.

University of Trento, Trento, Italy.

Publication information

Cyborg Bionic Syst. 2024 May 16;5:0100. doi: 10.34133/cbsystems.0100. eCollection 2024.

Abstract

Three-dimensional skeleton-based action recognition (3D SAR) has gained significant attention within the computer vision community, owing to the inherent advantages offered by skeleton data. As a result, a plethora of impressive works, including those based on conventional handcrafted features and learned feature extraction methods, have been conducted over the years. However, prior surveys on action recognition have primarily focused on video or red-green-blue (RGB) data-dominated approaches, with limited coverage of skeleton data. Furthermore, despite the extensive application of deep learning methods in this field, there has been a notable absence of research that provides an introductory or comprehensive review from the perspective of deep learning architectures. To address these limitations, this survey first underscores the importance of action recognition and emphasizes the significance of 3D skeleton data as a valuable modality. Subsequently, we provide a comprehensive introduction to mainstream action recognition techniques based on 4 fundamental deep architectures, i.e., recurrent neural networks, convolutional neural networks, graph convolutional networks, and Transformers. All methods with the corresponding architectures are then presented in a data-driven manner with detailed discussion. Finally, we offer insights into the current largest 3D skeleton dataset, NTU-RGB+D, and its new edition, NTU-RGB+D 120, along with an overview of several top-performing algorithms on these datasets. To the best of our knowledge, this research represents the first comprehensive discussion of deep learning-based action recognition using 3D skeleton data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d25/11096730/27ad4df443df/cbsystems.0100.fig.001.jpg
