Bi Yihan, Wang Rong, Zhou Qianli, Zeng Zhaolong, Lin Ronghui, Wang Mingjie
School of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China.
Key Laboratory of Security Prevention Technology and Risk Assessment of Ministry of Public Security, Beijing 100038, China.
Entropy (Basel). 2024 Aug 13;26(8):681. doi: 10.3390/e26080681.
To reduce the discrepancy between the visible and infrared modalities and to strengthen pedestrian feature representation, we propose a cross-modality person re-identification method that integrates modality generation and feature enhancement. Specifically, a lightweight network performs dimension reduction and augmentation on visible images, generating intermediate modalities that bridge the gap between visible and infrared images. The Convolutional Block Attention Module (CBAM) is embedded in the ResNet50 backbone to selectively emphasize key features, first along the channel dimension and then along the spatial dimension. In addition, the Gradient Centralization algorithm is incorporated into the Stochastic Gradient Descent optimizer to accelerate convergence and improve the generalization of the network model. Experiments on the SYSU-MM01 and RegDB datasets show that the improved model raises Rank-1 accuracy by 7.12% and 6.34% and mAP by 4.00% and 6.05%, respectively.
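As an illustration of the optimizer change described in the abstract, the following is a minimal sketch of Gradient Centralization combined with a plain SGD update. It is not the authors' implementation: the function names `centralize_gradient` and `sgd_gc_step`, the learning rate, and the row-per-output-unit weight layout are assumptions made for this example, and momentum and weight decay are omitted. The idea shown is that each output unit's gradient vector is shifted to zero mean before the weight update.

```python
from typing import List

Matrix = List[List[float]]


def centralize_gradient(grad: Matrix) -> Matrix:
    """Return a copy of grad in which every row has zero mean.

    Assumption: each row holds the gradient of one output unit's
    weight vector (the usual "out x in" layout for a linear layer).
    """
    centered = []
    for row in grad:
        mean = sum(row) / len(row)
        centered.append([g - mean for g in row])
    return centered


def sgd_gc_step(weights: Matrix, grad: Matrix, lr: float = 0.1) -> Matrix:
    """One plain SGD update using the centralized gradient.

    Hypothetical helper for illustration only; it returns new weights
    instead of updating in place.
    """
    centered = centralize_gradient(grad)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, centered)
    ]


if __name__ == "__main__":
    grad = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
    weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    print(centralize_gradient(grad))
    print(sgd_gc_step(weights, grad, lr=0.5))
```

In this sketch only matrix-shaped weights are centralized; one-dimensional parameters such as biases would normally be passed to the optimizer unchanged.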