TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning.

Authors

Chen Shiming, Hong Ziming, Hou Wenjin, Xie Guo-Sen, Song Yibing, Zhao Jian, You Xinge, Yan Shuicheng, Shao Ling

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Nov;45(11):12844-12861. doi: 10.1109/TPAMI.2022.3229526. Epub 2023 Oct 3.

Abstract

Zero-shot learning (ZSL) tackles the novel class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Semantic knowledge is typically represented by attribute descriptions shared across classes, which act as strong priors for localizing the object attributes that correspond to discriminative region features, enabling significant and sufficient visual-semantic interaction for advancing ZSL. Existing attention-based models learn only inferior region features from a single image because they rely solely on unidirectional attention, which ignores the transferable and discriminative attribute localization of visual features needed to represent the key semantic knowledge for effective knowledge transfer in ZSL. In this paper, we propose a cross attribute-guided Transformer network, termed TransZero++, to refine visual features and learn accurate attribute localization for key semantic knowledge representations in ZSL. Specifically, TransZero++ employs an attribute → visual Transformer sub-net (AVT) and a visual → attribute Transformer sub-net (VAT) to learn attribute-based visual features and visual-based attribute features, respectively. By further introducing feature-level and prediction-level semantic collaborative losses, the two attribute-guided Transformers teach each other to learn semantic-augmented visual embeddings for key semantic knowledge representations via semantic collaborative learning. Finally, the semantic-augmented visual embeddings learned by AVT and VAT are fused and, together with the class semantic vectors, used to carry out the visual-semantic interaction for ZSL classification. Extensive experiments show that TransZero++ achieves new state-of-the-art results on three gold-standard ZSL benchmarks and on the large-scale ImageNet dataset. The project website is available at: https://shiming-chen.github.io/TransZero-pp/TransZero-pp.html.
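
To make the two attention directions concrete, the following is a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes single-head scaled dot-product attention with no learned projections, random toy inputs, a squared-L2 feature-level term, a symmetric-KL prediction-level term, and simple averaging as the fusion rule. All of these choices, and every name in the code (`cross_attention`, `emb_avt`, `emb_vat`, `class_sem`, the toy sizes), are assumptions introduced for illustration; the abstract does not specify them.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention (assumption: no learned projections)."""
    d = queries.shape[-1]
    weights = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return weights @ keys_values, weights

rng = np.random.default_rng(0)
R, A, C, D = 49, 6, 4, 16            # regions, attributes, classes, feature dim (toy sizes)
visual = rng.normal(size=(R, D))      # region features of one image (random stand-in)
attr_vecs = rng.normal(size=(A, D))   # one semantic vector per attribute (random stand-in)
class_sem = rng.random(size=(C, A))   # class semantic vectors over attributes (random stand-in)

# Attribute -> visual direction: each attribute queries the image regions.
attr_based_visual, attn_av = cross_attention(attr_vecs, visual)   # (A, D), (A, R)
# Visual -> attribute direction: each region queries the attributes.
visual_based_attr, attn_va = cross_attention(visual, attr_vecs)   # (R, D), (R, A)

# Map each branch to an A-dimensional attribute-score embedding (assumed mapping).
emb_avt = np.sum(attr_based_visual * attr_vecs, axis=1)           # (A,)
emb_vat = (visual_based_attr @ attr_vecs.T).mean(axis=0)          # (A,)

# Per-branch class predictions from compatibility with the class semantic vectors.
p_avt = softmax(class_sem @ emb_avt)
p_vat = softmax(class_sem @ emb_vat)

# Collaborative terms (assumed forms): squared L2 on embeddings, symmetric KL on predictions.
feat_loss = float(np.sum((emb_avt - emb_vat) ** 2))

def kl(p, q):
    """Kullback-Leibler divergence between two strictly positive distributions."""
    return float(np.sum(p * np.log(p / q)))

pred_loss = 0.5 * (kl(p_avt, p_vat) + kl(p_vat, p_avt))

# Fuse the two embeddings (assumed: plain average) and classify.
fused = 0.5 * (emb_avt + emb_vat)
predicted_class = int(np.argmax(class_sem @ fused))
```

In a trained model the two collaborative terms would be added to a classification loss so that each branch is pulled toward the other; here they are only computed to show where they would attach.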
