Zhuang Danfeng, Jiang Min, Wang Lei, Akremi Mohamed Sanim, Tabia Hedi
School of Automation, Wuxi University, Wuxi, 214105, China.
IBISC, University of Evry, Paris-Saclay University, 91020, Evry, France.
Sci Rep. 2025 May 29;15(1):18780. doi: 10.1038/s41598-025-02461-6.
In various studies on skeleton-based action recognition, graph convolution has been widely used to extract crucial skeleton information. However, existing methods often struggle to effectively capture the motion relationships between distinct body parts, such as the coordination between the torso and limbs. To address these limitations, this paper proposes a novel Part and Attention enhanced Feature Fusion Network (PAFFNet). First, we construct center-of-gravity dynamics within semantic body parts to enhance motion representation and capture dynamic trends. Additionally, we design a Joint-based Feature Fusion (JFF) stream and a Bone-based Feature Fusion (BFF) stream to extract and fuse complementary information between the constructed local part-based features and the global skeleton features, further enhancing motion representation. Furthermore, an Improved Adaptive Graph Convolution (IAGC) block with denoising and attention mechanisms is proposed to prioritize critical semantic features. Experiments conducted on the NTU RGB+D and Kinetics-Skeleton datasets demonstrate the effectiveness of PAFFNet in improving skeleton-based action recognition.
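To make the core idea concrete, the following is a minimal sketch of one adaptive graph-convolution step with a learned adjacency offset, in the general spirit of adaptive graph convolution as used in blocks like the IAGC. The abstract does not specify the actual formulation, so the shapes, the learned offset matrix, and the softmax normalization here are all illustrative assumptions, not the authors' method.

```python
# Illustrative adaptive graph convolution on skeleton joints.
# All design choices (learned offset B, softmax row-normalization,
# toy dimensions) are assumptions for demonstration only.
import math

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def adaptive_graph_conv(x, a_fixed, b_learned, w):
    """One adaptive graph-convolution layer:
    1) combine the fixed skeleton adjacency with a learned offset,
    2) row-normalize the result with softmax (adaptive adjacency),
    3) aggregate joint features and project with weight w.
    x: V x C joint features; a_fixed, b_learned: V x V; w: C x C'.
    """
    v = len(a_fixed)
    adj = [softmax([a_fixed[i][j] + b_learned[i][j] for j in range(v)])
           for i in range(v)]
    return matmul(matmul(adj, x), w)

# Toy example: 3 joints with 2 feature channels each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # fixed skeleton edges
b = [[0.1] * 3 for _ in range(3)]       # learned offset (assumed values)
w = [[1.0, 0.0], [0.0, 1.0]]            # identity projection for clarity
out = adaptive_graph_conv(x, a, b, w)
print(len(out), len(out[0]))  # 3 2
```

In a full model, `b_learned` and `w` would be trained parameters, and an attention mechanism would additionally reweight joints or channels before aggregation.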