Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification.

Author information

Li Yanping, Liu Yizhang, Zhang Hongyun, Zhao Cairong, Wei Zhihua, Miao Duoqian

Publication information

IEEE Trans Image Process. 2024;33:3200-3211. doi: 10.1109/TIP.2024.3393360. Epub 2024 May 6.

DOI:10.1109/TIP.2024.3393360

Abstract

Person re-identification (ReID) typically encounters varying degrees of occlusion in real-world scenarios. Previous methods have addressed this with handcrafted partitions or external cues, but they often compromise semantic information or increase network complexity. In this paper, we propose a new method, termed OAT, that approaches the problem from a different perspective. Specifically, we first use a Transformer backbone with multiple class tokens to learn diverse pedestrian features. Because the self-attention mechanism in the Transformer focuses solely on low-level feature correlations and neglects higher-order relations among different body parts or regions, we propose the Second-Order Attention (SOA) module to capture more comprehensive features. To keep the computation efficient, we further derive approximation formulations for implementing second-order attention. Observing that the importance of the semantics associated with different class tokens varies with the uncertain location and size of the occlusion, we propose the Entropy Guided Fusion (EGF) module for multiple class tokens. An uncertainty analysis is conducted on each class token: tokens with lower information entropy are assigned higher weights, and tokens with higher entropy are assigned lower weights. This dynamic weight adjustment mitigates the impact of occlusion-induced uncertainty on feature learning and thereby helps the model learn discriminative class token representations. Extensive experiments on occluded and holistic person re-identification datasets demonstrate the effectiveness of the proposed method.
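The entropy-based weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not give the EGF formulation, so everything below is an assumption made for illustration: the function names (`softmax`, `entropy`, `entropy_guided_fusion`) are hypothetical, each class token is assumed to come with its own classification logits, entropy is taken as Shannon entropy of the softmax over those logits, and the weights are obtained by a softmax over negative entropies. The paper's actual module may differ on every one of these points.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_guided_fusion(token_logits, token_features):
    """Fuse class-token features with weights that decrease with entropy.

    token_logits:   one list of classification logits per class token
                    (assumed input, not specified in the abstract).
    token_features: one feature vector per class token, all the same length.
    Returns (weights, fused_feature).
    """
    # Uncertainty of each class token: entropy of its predicted distribution.
    entropies = [entropy(softmax(logits)) for logits in token_logits]
    # Assumed weighting rule: softmax over negative entropy, so that
    # lower entropy gives a higher weight and the weights sum to 1.
    weights = softmax([-h for h in entropies])
    dim = len(token_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, token_features))
        for d in range(dim)
    ]
    return weights, fused


# Token 0 is confident (peaked logits); token 1 is maximally uncertain.
weights, fused = entropy_guided_fusion(
    token_logits=[[8.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
    token_features=[[1.0, 0.0], [0.0, 1.0]],
)
print(weights, fused)
```

In this toy input the confident token receives the larger weight, so the fused vector leans toward its feature. That matches the behaviour the abstract describes, without claiming to reproduce the authors' implementation.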

Similar articles

A Multi-Level Relation-Aware Transformer model for occluded person re-identification.

Neural Netw. 2024 Sep;177:106382. doi: 10.1016/j.neunet.2024.106382. Epub 2024 May 9.

MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation.

IEEE Trans Pattern Anal Mach Intell. 2024 Dec;46(12):8380-8395. doi: 10.1109/TPAMI.2024.3404422. Epub 2024 Nov 6.

Multi-Scale Efficient Graph-Transformer for Whole Slide Image Classification.

IEEE J Biomed Health Inform. 2023 Dec;27(12):5926-5936. doi: 10.1109/JBHI.2023.3317067. Epub 2023 Dec 5.

Structure-Aware Positional Transformer for Visible-Infrared Person Re-Identification.

IEEE Trans Image Process. 2022;31:2352-2364. doi: 10.1109/TIP.2022.3141868. Epub 2022 Mar 15.

Learning Feature Recovery Transformer for Occluded Person Re-Identification.

IEEE Trans Image Process. 2022;31:4651-4662. doi: 10.1109/TIP.2022.3186759. Epub 2022 Jul 12.

Heterogeneous feature-aware Transformer-CNN coupling network for person re-identification.

PeerJ Comput Sci. 2022 Sep 27;8:e1098. doi: 10.7717/peerj-cs.1098. eCollection 2022.

Semantic-Aware Message Broadcasting for Efficient Unsupervised Domain Adaptation.

IEEE Trans Image Process. 2024;33:5340-5353. doi: 10.1109/TIP.2024.3437212. Epub 2024 Oct 2.

TS-CAM: Token Semantic Coupled Attention Map for Weakly Supervised Object Localization.

IEEE Trans Neural Netw Learn Syst. 2024 Jul;35(7):9109-9121. doi: 10.1109/TNNLS.2022.3218471. Epub 2024 Jul 8.

A Semantic-Aware Attention and Visual Shielding Network for Cloth-Changing Person Re-Identification.

IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):1243-1257. doi: 10.1109/TNNLS.2023.3329384. Epub 2025 Jan 7.

Cited by

AIRHF-Net: an adaptive interaction representation hierarchical fusion network for occluded person re-identification.

Sci Rep. 2024 Nov 8;14(1):27242. doi: 10.1038/s41598-024-76781-4.
