Suppr 超能文献




KT-GAN: Knowledge-Transfer Generative Adversarial Network for Text-to-Image Synthesis.

Authors

Tan Hongchen, Liu Xiuping, Liu Meng, Yin Baocai, Li Xin

Publication

IEEE Trans Image Process. 2021;30:1275-1290. doi: 10.1109/TIP.2020.3026728. Epub 2020 Dec 23.

DOI: 10.1109/TIP.2020.3026728
PMID: 33001801
Abstract

This paper presents a new framework, Knowledge-Transfer Generative Adversarial Network (KT-GAN), for fine-grained text-to-image generation. We introduce two novel mechanisms, an Alternate Attention-Transfer Mechanism (AATM) and a Semantic Distillation Mechanism (SDM), to help the generator better bridge the cross-domain gap between text and image. The AATM alternately updates word attention weights and the attention weights of image sub-regions, to progressively highlight important word information and enrich the details of synthesized images. The SDM uses the image encoder trained on the Image-to-Image task to guide the training of the text encoder in the Text-to-Image task, yielding better text features and higher-quality images. With extensive experimental validation on two public datasets, our KT-GAN outperforms the baseline method significantly and achieves competitive results across different evaluation metrics.
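At its core, the SDM described in the abstract is a feature-level knowledge distillation: a frozen image encoder acts as the teacher and the text encoder is the student pulled toward it. A minimal NumPy sketch, assuming an MSE distillation loss and plain gradient descent on the student feature (the function names, loss form, and feature dimension are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def distillation_loss(text_feat: np.ndarray, image_feat: np.ndarray) -> float:
    """MSE between the student (text) feature and the teacher (image) feature."""
    return float(np.mean((text_feat - image_feat) ** 2))

def sdm_step(text_feat: np.ndarray, image_feat: np.ndarray, lr: float = 0.5) -> np.ndarray:
    """One gradient-descent step pulling the text feature toward the frozen teacher.

    Gradient of mean((s - t)^2) w.r.t. the student s is 2 * (s - t) / n.
    """
    grad = 2.0 * (text_feat - image_feat) / text_feat.size
    return text_feat - lr * grad

rng = np.random.default_rng(0)
teacher = rng.normal(size=8)   # feature from the frozen image encoder (Image-to-Image task)
student = rng.normal(size=8)   # feature from the text encoder being trained

before = distillation_loss(student, teacher)
for _ in range(100):
    student = sdm_step(student, teacher)
after = distillation_loss(student, teacher)
assert after < before  # distillation shrinks the text-image feature gap
```

In the full model this loss would be added to the GAN objective so the text encoder learns image-grounded semantics rather than being trained on the adversarial signal alone.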


Similar Articles

1
KT-GAN: Knowledge-Transfer Generative Adversarial Network for Text-to-Image Synthesis.
IEEE Trans Image Process. 2021;30:1275-1290. doi: 10.1109/TIP.2020.3026728. Epub 2020 Dec 23.
2
DR-GAN: Distribution Regularization for Text-to-Image Generation.
IEEE Trans Neural Netw Learn Syst. 2023 Dec;34(12):10309-10323. doi: 10.1109/TNNLS.2022.3165573. Epub 2023 Nov 30.
3
Word self-update contrastive adversarial networks for text-to-image synthesis.
Neural Netw. 2023 Oct;167:433-444. doi: 10.1016/j.neunet.2023.08.038. Epub 2023 Aug 25.
4
Generative adversarial networks with decoder-encoder output noises.
Neural Netw. 2020 Jul;127:19-28. doi: 10.1016/j.neunet.2020.04.005. Epub 2020 Apr 9.
5
DualG-GAN, a Dual-channel Generator based Generative Adversarial Network for text-to-face synthesis.
Neural Netw. 2022 Nov;155:155-167. doi: 10.1016/j.neunet.2022.08.016. Epub 2022 Aug 19.
6
SAM-GAN: Self-Attention supporting Multi-stage Generative Adversarial Networks for text-to-image synthesis.
Neural Netw. 2021 Jun;138:57-67. doi: 10.1016/j.neunet.2021.01.023. Epub 2021 Feb 10.
7
Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained Text-to-Image Synthesis.
IEEE Trans Image Process. 2021;30:2798-2809. doi: 10.1109/TIP.2021.3055062. Epub 2021 Feb 12.
8
Bidirectional cross-modality unsupervised domain adaptation using generative adversarial networks for cardiac image segmentation.
Comput Biol Med. 2021 Sep;136:104726. doi: 10.1016/j.compbiomed.2021.104726. Epub 2021 Aug 4.
9
Image manipulation with natural language using Two-sided Attentive Conditional Generative Adversarial Network.
Neural Netw. 2021 Apr;136:207-217. doi: 10.1016/j.neunet.2020.09.002. Epub 2020 Sep 12.
10
StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks.
IEEE Trans Pattern Anal Mach Intell. 2019 Aug;41(8):1947-1962. doi: 10.1109/TPAMI.2018.2856256. Epub 2018 Jul 16.

Cited By

1
GOYA: Leveraging Generative Art for Content-Style Disentanglement.
J Imaging. 2024 Jun 26;10(7):156. doi: 10.3390/jimaging10070156.
2
Fast and Efficient Design of Deep Neural Networks for Predicting N-Methylguanosine Sites Using autoBioSeqpy.
ACS Omega. 2023 May 23;8(22):19728-19740. doi: 10.1021/acsomega.3c01371. eCollection 2023 Jun 6.
3
A Review of Multi-Modal Learning from the Text-Guided Visual Processing Viewpoint.
Sensors (Basel). 2022 Sep 8;22(18):6816. doi: 10.3390/s22186816.
4
CycleStyleGAN-Based Knowledge Transfer for a Machining Digital Twin.
Front Artif Intell. 2021 Nov 25;4:767451. doi: 10.3389/frai.2021.767451. eCollection 2021.