BiFFN: Bi-Frequency Guided Feature Fusion Network for Visible-Infrared Person Re-Identification.

Author Information

Cao Xingyu, Ding Pengxin, Li Jie, Chen Mei

Affiliations

School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China.

Publication Information

Sensors (Basel). 2025 Feb 20;25(5):1298. doi: 10.3390/s25051298.

Abstract

Visible-infrared person re-identification (VI-ReID) aims to match pedestrian images across modalities by minimizing the modality gap between them. Existing methods primarily extract cross-modality features in the spatial domain, which often limits how much useful information can be captured. To address this limitation, we propose a novel bi-frequency feature fusion network (BiFFN) that extracts and fuses high-frequency, low-frequency, and spatial-domain features to reduce the modality gap. Unlike conventional approaches that either focus on a single frequency component or employ simple multi-branch fusion strategies, our method fundamentally addresses the modality discrepancy through systematic frequency-space co-learning. The network introduces a frequency-spatial enhancement (FSE) module to strengthen feature representation in both domains. A deep frequency mining (DFM) module further optimizes the use of cross-modality information by exploiting the distinct characteristics of high- and low-frequency features, and a cross-frequency fusion (CFF) module aligns the low-frequency features and fuses them with the high-frequency features to generate intermediate features that carry critical information from each modality. To refine the distribution of identity features in the common space, we develop a unified modality center (UMC) loss, which promotes a more balanced inter-modality distribution while preserving discriminative identity information. Extensive experiments demonstrate that BiFFN achieves state-of-the-art performance in VI-ReID: it reaches a Rank-1 accuracy of 77.5% and an mAP of 75.9% on the SYSU-MM01 dataset under the all-search mode, and a Rank-1 accuracy of 58.5% and an mAP of 63.7% on the LLCM dataset under the IR-VIS mode. These results verify that integrating feature fusion with frequency-domain information significantly reduces the modality gap and outperforms previous methods.
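
The record contains only the abstract and no implementation details, so the following is a minimal, illustrative PyTorch sketch of the two ideas the abstract names: decomposing features into low- and high-frequency components (the kind of split the FSE/DFM/CFF modules operate on) and a center-style loss that balances the two modalities in the spirit of the UMC loss. The function names (`low_high_split`, `unified_center_loss`), the FFT low-pass mask, and the exact loss form are assumptions made for illustration, not the authors' formulation.

```python
import torch
import torch.nn.functional as F


def low_high_split(feat: torch.Tensor, radius_ratio: float = 0.25):
    """Split a (B, C, H, W) feature map into low- and high-frequency parts
    with a centered circular low-pass mask in the 2D Fourier domain.
    Illustrative only; BiFFN's actual decomposition may differ."""
    _, _, H, W = feat.shape
    spec = torch.fft.fftshift(torch.fft.fft2(feat, norm="ortho"), dim=(-2, -1))
    yy, xx = torch.meshgrid(
        torch.arange(H, device=feat.device),
        torch.arange(W, device=feat.device),
        indexing="ij",
    )
    dist = torch.sqrt((yy - H / 2.0) ** 2 + (xx - W / 2.0) ** 2)
    mask = (dist <= radius_ratio * min(H, W)).float()  # 1 inside the low-pass radius
    low = torch.fft.ifft2(
        torch.fft.ifftshift(spec * mask, dim=(-2, -1)), norm="ortho"
    ).real
    high = feat - low  # residual carries edges and texture (high frequencies)
    return low, high


def unified_center_loss(vis_feat, vis_labels, ir_feat, ir_labels):
    """Center-style alignment term: pull each identity's visible and infrared
    feature centers toward their joint (unified) center. Assumes every identity
    in the batch has samples from both modalities (PK-style sampling)."""
    terms = []
    for pid in torch.unique(torch.cat([vis_labels, ir_labels])):
        v = vis_feat[vis_labels == pid].mean(dim=0)  # visible-modality identity center
        r = ir_feat[ir_labels == pid].mean(dim=0)    # infrared-modality identity center
        c = 0.5 * (v + r)                            # unified cross-modality center
        terms.append(F.mse_loss(v, c) + F.mse_loss(r, c))
    return torch.stack(terms).mean()
```

In a training loop, backbone feature maps from the visible and infrared branches could be decomposed with `low_high_split`, processed and fused by the frequency modules, and the resulting identity embeddings regularized with `unified_center_loss` alongside the usual identification loss; this mirrors the abstract's description only at sketch level.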

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d63/11902842/b7c9e6c3aea0/sensors-25-01298-g001.jpg
