

Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited.

Authors

Xu Xing, Lin Kaiyi, Yang Yang, Hanjalic Alan, Shen Heng Tao

Publication

IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3030-3047. doi: 10.1109/TPAMI.2020.3045530. Epub 2022 May 5.

DOI: 10.1109/TPAMI.2020.3045530
PMID: 33332264
Abstract

Recently, the generative adversarial network (GAN) has shown a strong ability to model data distributions via adversarial learning. Cross-modal GANs, which attempt to leverage this power to model the cross-modal joint distribution and learn compatible cross-modal features, have become a research hotspot. However, existing cross-modal GAN approaches typically 1) require labeled multimodal data, collected at massive labor cost, to establish cross-modal correlation; 2) use the vanilla GAN model, which results in an unstable training procedure and meaningless synthetic features; and 3) lack extensibility for retrieving cross-modal data of new classes. In this article, we revisit the adversarial learning in existing cross-modal GAN methods and propose Joint Feature Synthesis and Embedding (JFSE), a novel method that jointly performs multimodal feature synthesis and common embedding space learning to overcome the above three shortcomings. Specifically, JFSE deploys two coupled conditional Wasserstein GAN modules for the input data of the two modalities, to synthesize meaningful and correlated multimodal features under the guidance of the word embeddings of class labels. Moreover, three distribution alignment schemes with advanced cycle-consistency constraints are proposed to preserve semantic compatibility and enable knowledge transfer in the common embedding space, for both true and synthetic cross-modal features. These add-ons not only help learn a more effective common embedding space that captures the cross-modal correlation, but also facilitate transferring knowledge to multimodal data of new classes. Extensive experiments on four widely used cross-modal datasets, with comparisons against more than ten state-of-the-art approaches, show that JFSE achieves remarkable accuracy improvements on both the standard retrieval task and the newly explored zero-shot and generalized zero-shot retrieval tasks.
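The cycle-consistency constraint mentioned in the abstract penalizes the round-trip drift when a modality feature is mapped into the common embedding space and back. The toy sketch below illustrates only that general idea; the linear maps, dimensions, and function names are illustrative assumptions, not the authors' JFSE implementation.

```python
# Toy illustration of a cycle-consistency penalty between a modality's
# feature space and a shared embedding space. All weights and dimensions
# are made up for illustration; JFSE's real modules are learned GANs.

def mat_vec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

def l2(a, b):
    """Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

# Toy "encoder" projecting a 3-d image feature into a 2-d common space,
# and a "decoder" mapping the embedding back to the feature space.
W_enc = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]
W_dec = [[1.0, 0.0],
         [0.0, 1.0],
         [0.0, 0.0]]

img_feat = [0.5, -0.2, 0.0]
z = mat_vec(W_enc, img_feat)       # embed into the common space
img_rec = mat_vec(W_dec, z)        # map back (the "cycle")
cycle_loss = l2(img_feat, img_rec) # penalize round-trip drift
print(round(cycle_loss, 4))        # 0.0: these toy maps invert each other here
```

In training, such a loss term would be minimized alongside the adversarial and alignment objectives, encouraging the embedding to remain semantically faithful to each modality.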


Similar Articles

1. Joint Feature Synthesis and Embedding: Adversarial Cross-Modal Retrieval Revisited.
IEEE Trans Pattern Anal Mach Intell. 2022 Jun;44(6):3030-3047. doi: 10.1109/TPAMI.2020.3045530. Epub 2022 May 5.
2. Ternary Adversarial Networks With Self-Supervision for Zero-Shot Cross-Modal Retrieval.
IEEE Trans Cybern. 2020 Jun;50(6):2400-2413. doi: 10.1109/TCYB.2019.2928180. Epub 2019 Jul 24.
3. Cross-modal distribution alignment embedding network for generalized zero-shot learning.
Neural Netw. 2022 Apr;148:176-182. doi: 10.1016/j.neunet.2022.01.007. Epub 2022 Jan 29.
4. Modality independent adversarial network for generalized zero shot image classification.
Neural Netw. 2021 Feb;134:11-22. doi: 10.1016/j.neunet.2020.11.007. Epub 2020 Nov 21.
5. Investigating the Bilateral Connections in Generative Zero-Shot Learning.
IEEE Trans Cybern. 2022 Aug;52(8):8167-8178. doi: 10.1109/TCYB.2021.3050803. Epub 2022 Jul 19.
6. Augmented semantic feature based generative network for generalized zero-shot learning.
Neural Netw. 2021 Nov;143:1-11. doi: 10.1016/j.neunet.2021.04.014. Epub 2021 Apr 21.
7. SCH-GAN: Semi-Supervised Cross-Modal Hashing by Generative Adversarial Network.
IEEE Trans Cybern. 2020 Feb;50(2):489-502. doi: 10.1109/TCYB.2018.2868826. Epub 2018 Sep 26.
8. MHTN: Modal-Adversarial Hybrid Transfer Network for Cross-Modal Retrieval.
IEEE Trans Cybern. 2020 Mar;50(3):1047-1059. doi: 10.1109/TCYB.2018.2879846. Epub 2018 Dec 5.
9. Common feature learning for brain tumor MRI synthesis by context-aware generative adversarial network.
Med Image Anal. 2022 Jul;79:102472. doi: 10.1016/j.media.2022.102472. Epub 2022 May 4.
10. Bridging multimedia heterogeneity gap via Graph Representation Learning for cross-modal retrieval.
Neural Netw. 2021 Feb;134:143-162. doi: 10.1016/j.neunet.2020.11.011. Epub 2020 Nov 28.

Cited By

1. Enhanced Soft Sensor with Qualified Augmented Samples for Quality Prediction of the Polyethylene Process.
Polymers (Basel). 2022 Nov 7;14(21):4769. doi: 10.3390/polym14214769.