
GNDAN: Graph Navigated Dual Attention Network for Zero-Shot Learning.

Authors

Chen Shiming, Hong Ziming, Xie Guosen, Peng Qinmu, You Xinge, Ding Weiping, Shao Ling

Publication

IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):4516-4529. doi: 10.1109/TNNLS.2022.3155602. Epub 2024 Apr 4.

Abstract

Zero-shot learning (ZSL) tackles the unseen class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Typically, to guarantee desirable knowledge transfer, a direct embedding is adopted for associating the visual and semantic domains in ZSL. However, most existing ZSL methods focus on learning the embedding from implicit global features or image regions to the semantic space. Thus, they fail to: 1) exploit the appearance relationship priors between various local regions of a single image, which correspond to the semantic information, and 2) jointly learn cooperative global and local features for discriminative feature representations. In this article, we propose the novel graph navigated dual attention network (GNDAN) for ZSL to address these drawbacks. GNDAN employs a region-guided attention network (RAN) and a region-guided graph attention network (RGAT) to jointly learn a discriminative local embedding and incorporate global context for exploiting explicit global embeddings under the guidance of a graph. Specifically, RAN uses soft spatial attention to discover discriminative regions for generating local embeddings. Meanwhile, RGAT employs attribute-based attention to obtain attribute-based region features, where each attribute focuses on the most relevant image regions. Motivated by graph neural networks (GNNs), which are beneficial for representing structural relationships, RGAT further leverages a graph attention network to exploit the relationships between the attribute-based region features for explicit global embedding representations. Based on a self-calibration mechanism, the learned joint visual embedding is matched with the semantic embedding to form the final prediction. Extensive experiments on three benchmark datasets demonstrate that the proposed GNDAN achieves superior performance to state-of-the-art methods. Our code and trained models are available at https://github.com/shiming-chen/GNDAN.
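To make the data flow of the two branches concrete, below is a minimal PyTorch sketch of the dual-attention idea as described in the abstract. It is not the authors' implementation (see the GitHub repository linked above): all class and variable names are hypothetical, the graph over attribute-based features is assumed fully connected, the self-calibration step is omitted, and the dimensions are toy values.

```python
# Minimal sketch of the dual attention branches described in the abstract.
# NOT the authors' code (see https://github.com/shiming-chen/GNDAN); every
# name here is hypothetical and the dimensions below are toy values.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RegionGuidedAttention(nn.Module):
    """RAN-style branch: soft spatial attention over CNN region features,
    pooled into a local embedding in the semantic (attribute) space."""

    def __init__(self, feat_dim, attr_dim):
        super().__init__()
        self.score = nn.Conv1d(feat_dim, 1, kernel_size=1)  # one score per region
        self.embed = nn.Linear(feat_dim, attr_dim)          # map to semantic space

    def forward(self, regions):                    # regions: (B, R, D)
        logits = self.score(regions.transpose(1, 2))        # (B, 1, R)
        alpha = F.softmax(logits, dim=-1)                   # soft spatial attention
        local = torch.bmm(alpha, regions).squeeze(1)        # (B, D) weighted pooling
        return self.embed(local)                            # (B, A) local embedding


class RegionGuidedGraphAttention(nn.Module):
    """RGAT-style branch: attribute-based attention builds one region feature
    per attribute, then a single graph-attention layer (assumed fully
    connected here) relates those attribute-based features."""

    def __init__(self, feat_dim, attr_dim):
        super().__init__()
        self.attr_query = nn.Parameter(torch.randn(attr_dim, feat_dim))
        self.gat_w = nn.Linear(feat_dim, feat_dim, bias=False)
        self.gat_a = nn.Linear(2 * feat_dim, 1, bias=False)
        self.embed = nn.Linear(feat_dim, 1)

    def forward(self, regions):                    # regions: (B, R, D)
        # Attribute-based attention: each attribute attends over all regions.
        att = F.softmax(self.attr_query @ regions.transpose(1, 2), dim=-1)  # (B, A, R)
        nodes = att @ regions                      # (B, A, D), one node per attribute
        # Graph attention over the fully connected attribute graph.
        h = self.gat_w(nodes)                      # (B, A, D)
        n = h.size(1)
        pair = torch.cat([h.unsqueeze(2).expand(-1, -1, n, -1),
                          h.unsqueeze(1).expand(-1, n, -1, -1)], dim=-1)
        e = F.leaky_relu(self.gat_a(pair)).squeeze(-1)      # (B, A, A) edge scores
        h = torch.softmax(e, dim=-1) @ h           # one round of message passing
        return self.embed(h).squeeze(-1)           # (B, A) global embedding


# Toy usage: fuse both branches and score classes by compatibility with their
# attribute vectors (the paper's self-calibration term is omitted here).
B, R, D, A = 2, 49, 256, 32                # batch, regions, feature dim, attributes
regions = torch.randn(B, R, D)             # e.g. flattened CNN feature-map cells
class_attrs = torch.randn(50, A)           # per-class semantic (attribute) vectors
joint = RegionGuidedAttention(D, A)(regions) + RegionGuidedGraphAttention(D, A)(regions)
logits = joint @ class_attrs.t()           # (B, 50) class compatibility scores
print(logits.shape)                        # torch.Size([2, 50])
```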

