Wei Siwei, Xu Xiangyuan, Liu Dewen, Wang Chunzhi, Yan Lingyu, Wu Wangyu
School of Computer Science, Hubei University of Technology, Wuhan 430068, China.
CCCC Second Harbor Engineering Company Ltd., Wuhan 430056, China.
Sensors (Basel). 2025 Jun 16;25(12):3759. doi: 10.3390/s25123759.
Gait recognition is a non-contact biometric technology that offers unique advantages where subjects must be identified at long distance without their active cooperation. However, existing gait recognition methods rely predominantly on single-modal data, whose feature expression capability is insufficient under the complex factors of real-world environments, including viewpoint variations, clothing differences, occlusion, and illumination changes. This paper addresses these challenges with a multi-modal gait recognition network based on channel shuffle regulation and spatial-frequency joint learning, which integrates two complementary modalities (silhouette data and heatmap data) to construct a more comprehensive gait representation. The channel shuffle-based feature selective regulation module achieves cross-channel information interaction and feature enhancement through channel grouping and feature shuffling. The module divides the input features along the channel dimension into multiple subspaces, which undergo channel-aware and spatial-aware processing to capture dependencies across different dimensions. Channel shuffling then exchanges information between the different semantic groups, adaptively enhancing and optimizing the features with relatively low parameter overhead. The spatial-frequency joint learning module maps spatiotemporal features to the spectral domain through the fast Fourier transform, effectively capturing the inherent periodic patterns and long-range dependencies in gait sequences. Because frequency-domain processing has a global receptive field, the model can go beyond local spatiotemporal constraints and capture global motion patterns. In parallel, the spatial-domain processing branch balances the contributions of frequency-domain and spatial-domain information through an adaptive weighting mechanism, enhancing the features while maintaining computational efficiency. Experimental results demonstrate that the proposed GaitCSF model achieves significant performance improvements over traditional methods on mainstream datasets including GREW, Gait3D, and SUSTech1k. These results are significant for improving the performance and robustness of gait recognition systems deployed in practical application scenarios.
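The two mechanisms named in the abstract, channel shuffling and spatial-frequency fusion, can be illustrated with a minimal sketch. This is not the GaitCSF implementation: the abstract does not give the architecture, so every function name here (`channel_shuffle`, `dominant_period`, `fuse`), the fixed per-frequency weights, the 3-tap average standing in for the spatial branch, and the sigmoid gate standing in for the adaptive weighting are assumptions made for illustration. A naive discrete Fourier transform replaces the fast Fourier transform so that the example needs only the standard library; both compute the same spectrum.

```python
import cmath
import math


def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape to groups x n, transpose, flatten)."""
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # Output position k takes channel (k % groups) * per_group + k // groups.
    return [channels[(k % groups) * per_group + k // groups]
            for k in range(len(channels))]


def dft(signal):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform; returns only the real part of each sample."""
    n = len(spectrum)
    return [sum(c * cmath.exp(2j * math.pi * k * t / n)
                for k, c in enumerate(spectrum)).real / n
            for t in range(n)]


def dominant_period(signal):
    """Period, in frames, of the strongest non-DC frequency component."""
    n = len(signal)
    spectrum = dft(signal)
    k = max(range(1, n // 2 + 1), key=lambda i: abs(spectrum[i]))
    return n / k


def fuse(signal, freq_weights, gate_logit=0.0):
    """Blend a frequency-domain branch with a local spatial branch.

    freq_weights: hypothetical per-frequency scaling (learned in a real model).
    gate_logit:   hypothetical scalar; its sigmoid weights the two branches.
    """
    n = len(signal)
    # Frequency branch: every output sample depends on every input sample.
    filtered = idft([w * c for w, c in zip(freq_weights, dft(signal))])
    # Spatial branch: circular 3-tap average, a purely local operation.
    local = [(signal[t - 1] + signal[t] + signal[(t + 1) % n]) / 3
             for t in range(n)]
    alpha = 1 / (1 + math.exp(-gate_logit))
    return [alpha * f + (1 - alpha) * s for f, s in zip(filtered, local)]


# Six channels in two groups: [0, 1, 2 | 3, 4, 5] -> [0, 3, 1, 4, 2, 5]
print(channel_shuffle([0, 1, 2, 3, 4, 5], 2))

# A toy "gait" signal that repeats every 8 frames over a 32-frame sequence.
stride = [math.sin(2 * math.pi * t / 8) for t in range(32)]
print(dominant_period(stride))
```

After the shuffle, each consecutive pair of channels contains one channel from each original group, which is how information crosses group boundaries without extra parameters. The frequency branch shows why the receptive field is global: a single spectral coefficient summarizes the whole sequence, so the stride period is read directly from the spectrum.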