Graph Diffusion Convolutional Network for Skeleton Based Semantic Recognition of Two-Person Actions.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Jul;45(7):8477-8493. doi: 10.1109/TPAMI.2023.3238411. Epub 2023 Jun 5.

Abstract

Graph Convolutional Networks (GCNs) have substantially advanced skeleton-based human action recognition. However, existing GCN-based methods mostly treat the problem as recognizing each person's action separately and ignore the interaction between the action initiator and the action responder, which matters most in the fundamental task of two-person interactive action recognition. Effectively accounting for the intrinsic local-global clues of two-person activity remains challenging. Additionally, message passing in a GCN depends on the adjacency matrix, but skeleton-based action recognition methods tend to compute that matrix from the fixed natural connectivity of the skeleton. As a result, messages can only travel along the same fixed paths in every layer of the network and for every action, which greatly reduces the network's flexibility. To this end, we propose a novel graph diffusion convolutional network for skeleton-based semantic recognition of two-person actions, which embeds graph diffusion into GCNs. Technically, we construct the adjacency matrix dynamically from practical action information so that message propagation is guided in a more meaningful way. We also introduce a frame importance calculation module to perform dynamic convolution, which avoids a drawback of traditional convolution, whose shared weights may fail to capture key frames or may be affected by noisy frames. In addition, we comprehensively leverage multidimensional features related to the joints' local visual appearance, global spatial relationships, and temporal coherence, and for each type of feature we design a dedicated metric that measures similarity according to the physical laws underlying the motion. Extensive experiments and comprehensive evaluations on four public large-scale datasets (NTU-RGB+D 60, NTU-RGB+D 120, Kinetics-Skeleton 400, and SBU-Interaction) demonstrate that our method outperforms state-of-the-art methods.
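
To make the idea of embedding graph diffusion into a GCN more concrete, the following is a minimal sketch of one generic form of graph diffusion, not the authors' implementation. The abstract does not state which diffusion kernel, normalization, or parameters the paper uses, so everything here is an assumption for illustration: the personalized-PageRank kernel, the teleport value `alpha`, the toy four-joint chain, and the helper name `ppr_diffusion` are all hypothetical. The sketch only shows how diffusion lets information reach joints that are not directly connected in a fixed skeleton graph.

```python
import numpy as np


def ppr_diffusion(adj, alpha=0.15):
    """Generic personalized-PageRank diffusion (an assumed kernel, not the paper's).

    Computes S = alpha * (I - (1 - alpha) * T)^-1, where T is the
    symmetrically normalized adjacency matrix with self-loops.
    """
    n = adj.shape[0]
    a_hat = adj + np.eye(n)                      # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    t = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * t)


# Hypothetical toy skeleton: four joints connected in a chain 0-1-2-3.
adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0

diffused = ppr_diffusion(adj)

# One graph-convolution step that uses the diffused matrix for message passing.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3))   # 4 joints, 3 input channels (made-up data)
weights = rng.normal(size=(3, 2))    # 3 input channels -> 2 output channels
out = diffused @ features @ weights
```

In this toy graph, joints 0 and 3 share no edge, yet the diffused matrix assigns them a nonzero weight, so a single layer can pass messages between them. A fixed skeleton adjacency would need several stacked layers for the same reach.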
