Xie Jianyang, Meng Yanda, Zhao Yitian, Anh Nguyen, Yang Xiaoyun, Zheng Yalin
IEEE Trans Image Process. 2024 Nov 19;PP. doi: 10.1109/TIP.2024.3497837.
Human action recognition is an essential topic in computer vision and image processing. Graph convolutional networks (GCNs) have attracted significant attention and achieved noteworthy performance in skeleton-based human action recognition. However, most previous graph-based methods focus on refining the skeleton topology without considering the types of the joints and edges or the order in which frames occur. This limits their ability to represent intrinsic semantic information. To address this challenge, we propose a dynamic semantic-based spatial-temporal graph convolution network (DS-STGCN). DS-STGCN has two dynamic semantic modules, one for spatial context and one for temporal context. Specifically, joint and edge types are encoded implicitly in the spatial module, and the occurrence order of frames is encoded implicitly in the temporal module. Extensive experiments on four datasets (NTU-RGB+D 60, NTU-RGB+D 120, Kinetics-400, and FineGYM) show that the two proposed semantic modules bring consistent improvements in recognition performance across various backbones. DS-STGCN also surpasses state-of-the-art methods on these datasets. On more challenging datasets such as Kinetics-400, it outperforms other state-of-the-art GCN-based methods by a large margin. The code has been released at https://github.com/davelailai/DS-STGCN.
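To make the idea of adding joint-type, edge-type, and frame-order information to a skeleton graph more concrete, here is a minimal pure-Python sketch. It is not the DS-STGCN implementation: the abstract says these semantics are encoded implicitly and gives no details, so the sketch uses explicit lookup tables and a standard sinusoidal position encoding purely for illustration. The toy skeleton, the type labels, the embedding values, and all names (`JOINT_EMB`, `EDGE_WEIGHT`, `semantic_graph_layer`, and so on) are assumptions, not taken from the paper or its repository.

```python
import math

# Toy skeleton: 3 joints in a chain (0-1-2). All values below are made up.
EDGES = [(0, 1), (1, 2)]
JOINT_TYPE = [0, 1, 1]                  # hypothetical labels: 0 = torso, 1 = limb
EDGE_TYPE = {(0, 1): 0, (1, 2): 1}      # hypothetical labels: 0 = torso-limb, 1 = limb-limb

JOINT_EMB = [[0.1, 0.0], [0.0, 0.1]]    # one 2-d vector per joint type (assumed)
EDGE_WEIGHT = [1.0, 0.5]                # one scalar per edge type (assumed)


def frame_order_encoding(t, dim):
    """Sinusoidal encoding of frame index t (a stand-in for temporal order)."""
    enc = []
    for i in range(dim):
        angle = t / (10000 ** (2 * (i // 2) / dim))
        enc.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return enc


def build_adjacency(num_joints):
    """Row-normalised adjacency with self-loops; edges weighted by edge type."""
    adj = [[0.0] * num_joints for _ in range(num_joints)]
    for v in range(num_joints):
        adj[v][v] = 1.0
    for a, b in EDGES:
        w = EDGE_WEIGHT[EDGE_TYPE[(a, b)]]
        adj[a][b] = w
        adj[b][a] = w
    for row in adj:
        total = sum(row)
        for j in range(num_joints):
            row[j] /= total
    return adj


def semantic_graph_layer(x):
    """x[t][v] is the feature vector (list of floats) of joint v at frame t."""
    num_joints = len(x[0])
    dim = len(x[0][0])
    adj = build_adjacency(num_joints)
    out = []
    for t, frame in enumerate(x):
        order = frame_order_encoding(t, dim)
        # Add joint-type and frame-order information to every joint feature.
        enriched = [
            [frame[v][c] + JOINT_EMB[JOINT_TYPE[v]][c] + order[c] for c in range(dim)]
            for v in range(num_joints)
        ]
        # Aggregate each joint's neighbours using the edge-type-weighted adjacency.
        out.append([
            [sum(adj[v][u] * enriched[u][c] for u in range(num_joints)) for c in range(dim)]
            for v in range(num_joints)
        ])
    return out


# Two identical all-zero frames: the outputs differ only through the order encoding.
x = [[[0.0, 0.0] for _ in range(3)] for _ in range(2)]
y = semantic_graph_layer(x)
```

In this toy setup, two identical input frames produce different outputs because the frame index enters the features, and joints of different types are distinguishable even when their raw coordinates coincide. A learned model would replace the fixed tables with trainable parameters.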