Zero-Shot Learning With Attentive Region Embedding and Enhanced Semantics.

Authors

Liu Yang, Dang Yuhao, Gao Xinbo, Han Jungong, Shao Ling

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4220-4231. doi: 10.1109/TNNLS.2022.3202014. Epub 2024 Feb 29.

Abstract

The performance of zero-shot learning (ZSL) can be improved progressively by learning better features and by generating pseudosamples for unseen classes. Existing ZSL works typically learn feature extractors and generators independently, which may shift the unseen samples away from their real distribution and leads to the domain bias problem. In this article, to tackle this challenge, we propose a variational autoencoder (VAE)-based framework, joint Attentive Region Embedding with Enhanced Semantics (AREES), which is tailored to zero-shot recognition. Specifically, AREES is end-to-end trainable and consists of three network branches: 1) an attentive region embedding, which uses an attention mechanism (AM) to learn semantic-guided visual features; 2) a decomposition structure with a semantic pivot regularization, which extracts enhanced semantics; and 3) a multimodal VAE (mVAE) with a cross-reconstruction loss and a distribution alignment loss, which yields a shared latent embedding space for visual features and semantics. Because feature extraction and feature generation are optimized together in AREES, the domain shift problem is addressed to a large extent. Comprehensive evaluations on six benchmarks, including ImageNet, demonstrate the superiority of the proposed model over its state-of-the-art counterparts.
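
The abstract names a cross-reconstruction loss and a distribution alignment loss for the mVAE branch but does not define them. The sketch below only illustrates what such losses can look like. It assumes a formulation that is common in aligned-VAE approaches to ZSL: an L1 cross-reconstruction term and a 2-Wasserstein distance between diagonal Gaussians. The function names, the argument layout, and the choice of distances are all hypothetical and may differ from what AREES actually uses.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Sum of absolute element-wise differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def cross_reconstruction_loss(
    x_vis: Vector,
    x_sem: Vector,
    z_vis: Vector,
    z_sem: Vector,
    decode_vis: Callable[[Vector], Vector],
    decode_sem: Callable[[Vector], Vector],
) -> float:
    """Assumed form: each modality is reconstructed from the OTHER
    modality's latent code and compared to the original with L1."""
    vis_from_sem = decode_vis(z_sem)  # visual features from the semantic latent
    sem_from_vis = decode_sem(z_vis)  # semantics from the visual latent
    return l1_distance(x_vis, vis_from_sem) + l1_distance(x_sem, sem_from_vis)


def distribution_alignment_loss(
    mu_vis: Vector,
    logvar_vis: Vector,
    mu_sem: Vector,
    logvar_sem: Vector,
) -> float:
    """Assumed form: 2-Wasserstein distance between two diagonal
    Gaussians, each given by its mean and log-variance."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_vis, mu_sem))
    std_term = sum(
        (math.exp(0.5 * lv_a) - math.exp(0.5 * lv_b)) ** 2
        for lv_a, lv_b in zip(logvar_vis, logvar_sem)
    )
    return math.sqrt(mean_term + std_term)


if __name__ == "__main__":
    # Toy decoders (identity) stand in for the real decoder networks.
    identity = lambda z: list(z)
    print(cross_reconstruction_loss([1.0, 2.0], [0.0, 0.0],
                                    [1.0, 2.0], [0.0, 0.0],
                                    identity, identity))
    print(distribution_alignment_loss([3.0, 0.0], [0.0, 0.0],
                                      [0.0, 4.0], [0.0, 0.0]))
```

In a trainable model these terms would be computed on batched tensors and added to the usual VAE objective. The pure-Python version only makes the arithmetic explicit.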
