
Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification.

Authors

Li Yanping, Liu Yizhang, Zhang Hongyun, Zhao Cairong, Wei Zhihua, Miao Duoqian

Publication information

IEEE Trans Image Process. 2024;33:3200-3211. doi: 10.1109/TIP.2024.3393360. Epub 2024 May 6.

Abstract

Person re-identification (ReID) typically encounters varying degrees of occlusion in real-world scenarios. Previous methods have addressed this using handcrafted partitions or external cues, but they often compromise semantic information or increase network complexity. In this paper, we propose a new method that approaches the problem from a different perspective, termed OAT. Specifically, we first use a Transformer backbone with multiple class tokens to learn diverse pedestrian features. Because the self-attention mechanism in the Transformer focuses solely on low-level feature correlations and neglects higher-order relations among different body parts or regions, we propose the Second-Order Attention (SOA) module to capture more comprehensive features. To keep the computation efficient, we further derive approximation formulations for implementing second-order attention. Observing that the semantic importance of different class tokens varies with the uncertain location and size of occlusion, we propose the Entropy Guided Fusion (EGF) module for multiple class tokens. By conducting uncertainty analysis on each class token, EGF assigns higher weights to class tokens with lower information entropy and lower weights to those with higher entropy. This dynamic weight adjustment mitigates the impact of occlusion-induced uncertainty on feature learning, thereby facilitating the acquisition of discriminative class token representations. Extensive experiments on occluded and holistic person re-identification datasets demonstrate the effectiveness of the proposed method.
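The entropy-based weighting described for EGF can be illustrated with a small sketch. The abstract does not give the actual formulation, so everything below is an assumption made for illustration: each class token is taken to have its own identity logits, its uncertainty is measured as the Shannon entropy of the softmax over those logits, and the fusion weights are a softmax over the negative entropies. The names `entropy_guided_fusion`, `tokens`, and `logits` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def entropy_guided_fusion(tokens, logits, eps=1e-12):
    """Fuse K class tokens, favouring the low-entropy ones.

    tokens: (K, D) class token features (hypothetical input).
    logits: (K, C) per-token identity logits (hypothetical input).
    Assumption: weights = softmax(-entropy); the paper may differ.
    """
    probs = softmax(logits, axis=-1)                          # (K, C)
    entropy = -np.sum(probs * np.log(probs + eps), axis=-1)   # (K,)
    weights = softmax(-entropy)                               # lower entropy -> higher weight
    fused = weights @ tokens                                  # (D,) weighted sum of tokens
    return fused, weights, entropy


# Three toy class tokens: confident, moderately uncertain, fully uncertain.
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
logits = np.array([[8.0, 0.0, 0.0], [1.0, 0.5, 0.0], [0.0, 0.0, 0.0]])
fused, weights, entropy = entropy_guided_fusion(tokens, logits)
```

In this toy input the first token has a sharply peaked prediction and the third a uniform one, so the first receives the largest weight and the third the smallest. Whether the method applies such weights to the token features, to per-token losses, or elsewhere is not stated in the abstract.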
