
Learning Heterogeneous Spatial-Temporal Context for Skeleton-Based Action Recognition.

Author Information

Gao Xuehao, Yang Yang, Wu Yang, Du Shaoyi

Publication Information

IEEE Trans Neural Netw Learn Syst. 2024 Sep;35(9):12130-12141. doi: 10.1109/TNNLS.2023.3252172. Epub 2024 Sep 3.

Abstract

Graph convolution networks (GCNs) have been widely used and have achieved fruitful progress in the skeleton-based action recognition task. In GCNs, node interaction modeling dominates the context aggregation and is therefore crucial for a graph-based convolution kernel to extract representative features. In this article, we take a closer look at a powerful graph convolution formulation for capturing rich movement patterns from skeleton-based graphs. Specifically, we propose a novel heterogeneous graph convolution (HetGCN) that can be considered a middle ground between the extremes of (2 + 1)-D and 3-D graph convolution. The core observation behind HetGCN is that multiple information flows are jointly intertwined in a 3-D convolution kernel, including spatial, temporal, and spatial-temporal cues. Since spatial and temporal information flows characterize different cues for action recognition, HetGCN first dynamically analyzes the pairwise interactions between each node and its cross-space-time neighbors and then encourages heterogeneous context aggregation among them. Treating HetGCN as a generic convolution formulation, we further develop it into two specific instantiations (i.e., intra-scale and inter-scale HetGCN) that significantly facilitate cross-space-time and cross-scale learning on skeleton graphs. By integrating these modules, we propose a strong human action recognition system that outperforms state-of-the-art methods, achieving accuracies of 93.1% on the NTU-60 cross-subject (X-Sub) benchmark, 88.9% on the NTU-120 X-Sub benchmark, and 38.4% on Kinetics-Skeleton.
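To make the idea of heterogeneous context aggregation concrete, below is a minimal PyTorch sketch of one such layer. It is not the authors' implementation: the tensor layout (batch, channels, frames, joints), the dot-product affinity used for the dynamic pairwise interactions, the wrap-around temporal neighborhood via torch.roll, and all names such as HetGCNSketch are illustrative assumptions. The sketch only shows the core pattern the abstract describes: separate weights for the spatial, temporal, and spatial-temporal information flows, gated by data-dependent node affinities.

```python
# Illustrative sketch only -- not the paper's released code. Assumes
# skeleton input of shape (batch, channels, frames, joints); all names
# (HetGCNSketch, w_spatial, ...) are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class HetGCNSketch(nn.Module):
    """Toy heterogeneous graph convolution: aggregates spatial, temporal,
    and spatial-temporal neighbor context with separate (heterogeneous)
    weights, gated by dynamic pairwise affinities."""

    def __init__(self, in_ch, out_ch, num_joints, tau=1):
        super().__init__()
        self.tau = tau  # temporal neighborhood radius (+/- tau frames)
        # One projection per information flow (the "heterogeneous" part).
        self.w_spatial = nn.Conv2d(in_ch, out_ch, 1)
        self.w_temporal = nn.Conv2d(in_ch, out_ch, 1)
        self.w_st = nn.Conv2d(in_ch, out_ch, 1)
        # Embeddings for the dynamic pairwise interactions (dot-product affinity).
        self.theta = nn.Conv2d(in_ch, out_ch // 2, 1)
        self.phi = nn.Conv2d(in_ch, out_ch // 2, 1)
        self.A = nn.Parameter(torch.eye(num_joints))  # learnable spatial graph

    def dynamic_adj(self, x):
        # x: (N, C, T, V) -> per-frame affinity matrix (N, T, V, V)
        q = self.theta(x).permute(0, 2, 3, 1)   # (N, T, V, C')
        k = self.phi(x).permute(0, 2, 1, 3)     # (N, T, C', V)
        aff = torch.softmax(q @ k, dim=-1)      # data-driven pairwise interactions
        return aff + self.A                     # combined with the learned graph

    def forward(self, x):
        # x: (N, C, T, V)
        adj = self.dynamic_adj(x)               # (N, T, V, V)

        # Spatial flow: aggregate same-frame joints via the affinity matrix.
        xs = torch.einsum('nctv,ntvw->nctw', x, adj)
        out = self.w_spatial(xs)

        # Temporal flow: the same joint across +/- tau frames (unnormalized
        # sum; roll wraps around the clip boundary -- a sketch simplification).
        xt = sum(torch.roll(x, s, dims=2) for s in range(-self.tau, self.tau + 1))
        out = out + self.w_temporal(xt)

        # Spatial-temporal flow: neighboring joints in neighboring frames.
        xst = sum(torch.einsum('nctv,ntvw->nctw', torch.roll(x, s, dims=2), adj)
                  for s in (-self.tau, self.tau))
        out = out + self.w_st(xst)
        return F.relu(out)


# Usage: 2 clips, 3 channels (x, y, confidence), 16 frames, 25 NTU joints.
x = torch.randn(2, 3, 16, 25)
layer = HetGCNSketch(in_ch=3, out_ch=64, num_joints=25)
print(layer(x).shape)  # torch.Size([2, 64, 16, 25])
```

This corresponds only to a single-resolution (intra-scale) kernel; the inter-scale instantiation in the paper additionally exchanges context across skeleton graphs of different scales.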

