Graph Sampling-Based Multi-Stream Enhancement Network for Visible-Infrared Person Re-Identification.

Authors

Jiang Jinhua, Xiao Junjie, Wang Renlin, Li Tiansong, Zhang Wenfeng, Ran Ruisheng, Xiang Sen

Affiliations

College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China.

School of Computer Engineering, Weifang University, Weifang 261061, China.

Publication

Sensors (Basel). 2023 Sep 18;23(18):7948. doi: 10.3390/s23187948.

Abstract

With the increasing demand for person re-identification (Re-ID), all-day retrieval has become an inevitable trend. Single-modal Re-ID is no longer sufficient to meet this requirement, making multi-modal data crucial for Re-ID. Consequently, the Visible-Infrared Person Re-Identification (VI Re-ID) task has been proposed, which aims to match person images across the visible and infrared modalities. The significant discrepancy between the two modalities poses a major challenge. Existing VI Re-ID methods focus on cross-modal feature learning and modality transformation to alleviate this discrepancy but overlook the contribution of person contour information. Contours are modality-invariant, which is vital for learning effective identity representations and for cross-modal matching. In addition, because of the low intra-modal diversity in the visible modality, it is difficult to distinguish the boundaries between some hard samples. To address these issues, we propose the Graph Sampling-based Multi-stream Enhancement Network (GSMEN). First, the Contour Expansion Module (CEM) incorporates a person's contour information into the original samples, further reducing the modality discrepancy and improving matching stability between image pairs from different modalities. Additionally, to better distinguish cross-modal hard sample pairs during training, an innovative Cross-modality Graph Sampler (CGS) is designed for sample selection before training. The CGS computes feature distances between samples from different modalities and groups similar samples into the same batch, effectively exploring the boundary relationships between hard classes in the cross-modal setting. Experiments conducted on the SYSU-MM01 and RegDB datasets demonstrate the superiority of the proposed method. Specifically, in the VIS→IR task, it achieves 93.69% Rank-1 accuracy and 92.56% mAP on the RegDB dataset.
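The abstract does not spell out how CEM fuses contour information with the raw pixels. As a rough illustration of the idea only (not the paper's implementation), one could extract a contour map with an off-the-shelf edge detector and attach it to the input as an extra stream; the Canny thresholds and the channel-stacking choice below are assumptions made for this sketch:

```python
import cv2
import numpy as np

def add_contour_stream(img_bgr: np.ndarray, low: int = 100, high: int = 200) -> np.ndarray:
    """Hypothetical contour-expansion sketch: derive a modality-invariant
    contour map via Canny edge detection and stack it as a fourth channel,
    so the downstream network sees shape cues alongside visible/IR pixels.
    The thresholds `low`/`high` are illustrative, not taken from the paper."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)     # H x W edge map with values in {0, 255}
    return np.dstack([img_bgr, edges])     # H x W x 4 input for a multi-stream model
```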

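The abstract describes CGS only at a high level: compute cross-modal feature distances and place similar (hard) identities in the same batch. Below is a minimal sketch of that idea; the use of per-identity class centroids and the greedy grouping are assumptions for illustration, not the paper's exact sampler:

```python
import numpy as np

def cross_modality_graph_batches(vis_feats, ir_feats, vis_labels, ir_labels,
                                 classes_per_batch=4):
    """Illustrative cross-modality sampling: build one feature centroid per
    identity and per modality, measure visible-to-infrared centroid distances,
    and greedily group the most confusable identities into the same batch."""
    classes = np.unique(vis_labels)
    vis_c = np.stack([vis_feats[vis_labels == c].mean(axis=0) for c in classes])
    ir_c = np.stack([ir_feats[ir_labels == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(vis_c[:, None, :] - ir_c[None, :, :], axis=-1)  # C x C

    batches, remaining = [], list(range(len(classes)))
    while remaining:
        anchor = remaining.pop(0)
        # nearest cross-modal neighbours among identities not yet assigned
        order = [j for j in np.argsort(dist[anchor]) if j in remaining]
        group = [anchor] + order[:classes_per_batch - 1]
        remaining = [j for j in remaining if j not in group]
        batches.append(classes[group].tolist())
    return batches  # each group of identity IDs forms one training mini-batch
```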

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a09d/10534846/3b56d0fec3d2/sensors-23-07948-g001.jpg
