Word self-update contrastive adversarial networks for text-to-image synthesis.

Affiliations

College of Information and Communication Engineering, Harbin Engineering University, 150001, Harbin, China.

Institute for Artificial Intelligence, Peking University, Beijing, 100871, China.

Publication information

Neural Netw. 2023 Oct;167:433-444. doi: 10.1016/j.neunet.2023.08.038. Epub 2023 Aug 25.

DOI: 10.1016/j.neunet.2023.08.038
PMID: 37673029
Abstract

Synthesizing realistic fine-grained images from text descriptions is a significant computer vision task. Although many GAN-based methods have been proposed for this task, generating high-quality images that are consistent with the text remains difficult. Existing GAN-based methods ignore important words because the generator uses fixed initial word features, and their discriminators neglect to learn semantic consistency between images and text. In this article, we propose a novel attentional generation and contrastive adversarial framework for fine-grained text-to-image synthesis, termed Word Self-Update Contrastive Adversarial Networks (WSC-GAN). Specifically, we introduce a dual attention module for modeling color details and semantic information. With a newly designed word self-update module, the generator can leverage visually important words to compute attention maps in the feature synthesis module. Furthermore, we design multi-branch contrastive discriminators to maintain better consistency between the generated image and the text description. Two novel contrastive losses are proposed for our discriminators to impose image-sentence and image-word consistency constraints. Extensive experiments on the CUB and MS-COCO datasets demonstrate that our method achieves better performance than state-of-the-art methods.
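
The abstract does not give the form of the two contrastive losses. As a rough illustration of what an image-sentence consistency constraint can look like, the sketch below uses a generic symmetric InfoNCE-style loss over a batch of paired image and sentence feature vectors. The loss form, the function names (`cosine`, `info_nce`), and the `temperature` value are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def info_nce(image_feats, sentence_feats, temperature=0.1):
    """Symmetric InfoNCE-style loss (assumed form, not the paper's).

    image_feats[i] and sentence_feats[i] are treated as the matching pair;
    every other sentence in the batch acts as a negative for image i,
    and every other image acts as a negative for sentence i.
    """
    n = len(image_feats)
    # Temperature-scaled similarity matrix: rows are images, columns sentences.
    sim = [
        [cosine(img, sent) / temperature for sent in sentence_feats]
        for img in image_feats
    ]

    def cross_entropy(logits, target):
        # Numerically stable -log softmax(logits)[target].
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        return log_z - logits[target]

    # Image-to-sentence direction: each row should peak on the diagonal.
    img_to_sent = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    # Sentence-to-image direction: each column should peak on the diagonal.
    sent_to_img = sum(
        cross_entropy([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (img_to_sent + sent_to_img)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    swapped = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", info_nce(images, matched))
    print("swapped pairs:", info_nce(images, swapped))
```

With this form, correctly paired features yield a loss near zero, while swapped pairs are penalized heavily. An image-word variant could apply the same idea to word-level features, again as an assumption rather than the authors' formulation.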
