Verbal-Person Nets: Pose-Guided Multi-Granularity Language-to-Person Generation.

Authors

Liu Deyin, Wu Lin, Zheng Feng, Liu Lingqiao, Wang Meng

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8589-8601. doi: 10.1109/TNNLS.2022.3151631. Epub 2023 Oct 27.

DOI:10.1109/TNNLS.2022.3151631
PMID:35263259
Abstract

Person image generation conditioned on natural language allows us to personalize image editing in a user-friendly manner. This fashion, however, involves different granularities of semantic relevance between texts and visual content. Given a sentence describing an unknown person, we propose a novel pose-guided multi-granularity attention architecture to synthesize the person image in an end-to-end manner. To determine what content to draw at a global outline, the sentence-level description and pose feature maps are incorporated into a U-Net architecture to generate a coarse person image. To further enhance the fine-grained details, we propose to draw the human body parts with highly correlated textual nouns and determine the spatial positions with respect to target pose points. Our model is premised on a conditional generative adversarial network (GAN) that translates language description into a realistic person image. The proposed model is coupled with two-stream discriminators: 1) text-relevant local discriminators to improve the fine-grained appearance by identifying the region-text correspondences at the finer manipulation and 2) a global full-body discriminator to regulate the generation via a pose-weighting feature selection. Extensive experiments conducted on benchmarks validate the superiority of our method for person image generation.
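
As a rough illustration of the fine-grained step described in the abstract, where body parts are drawn from highly correlated textual nouns, the sketch below scores each body-part region against each noun and normalizes the scores into attention weights. The abstract does not specify the attention formula, so the dot-product softmax used here is an assumption, and the function names, the toy feature vectors, and the region and noun labels are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_regions_to_nouns(region_feats, noun_embs):
    """Weight nouns per body-part region by dot-product similarity.

    Assumption: plain dot-product attention, not necessarily the paper's.
    Returns (weights, contexts), where weights[i][j] is how strongly
    region i attends to noun j, and contexts[i] is the attention-weighted
    sum of the noun embeddings for region i.
    """
    dim = len(noun_embs[0])
    weights, contexts = [], []
    for region in region_feats:
        # Similarity of this region to every noun embedding.
        scores = [sum(r * n for r, n in zip(region, noun)) for noun in noun_embs]
        w = softmax(scores)
        # Text context vector that would condition the drawing of this region.
        ctx = [sum(w[j] * noun_embs[j][d] for j in range(len(noun_embs)))
               for d in range(dim)]
        weights.append(w)
        contexts.append(ctx)
    return weights, contexts


# Toy, hand-made vectors (hypothetical): one feature per body-part region,
# one embedding per noun taken from the sentence.
regions = {"torso": [1.0, 0.0, 0.0], "legs": [0.0, 1.0, 0.0]}
nouns = {"jacket": [4.0, 0.0, 0.0], "jeans": [0.0, 4.0, 0.0]}

weights, contexts = attend_regions_to_nouns(list(regions.values()),
                                            list(nouns.values()))
noun_names = list(nouns)
for name, w in zip(regions, weights):
    best = max(range(len(w)), key=w.__getitem__)
    print(name, "->", noun_names[best])
```

In a full model the region features would be taken from the generator's feature maps around the target pose points and the noun embeddings from a text encoder; here both are fixed by hand so that only the attention step is shown.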

Similar articles

1. Verbal-Person Nets: Pose-Guided Multi-Granularity Language-to-Person Generation.
   IEEE Trans Neural Netw Learn Syst. 2023 Nov;34(11):8589-8601. doi: 10.1109/TNNLS.2022.3151631. Epub 2023 Oct 27.
2. Image manipulation with natural language using Two-sided Attentive Conditional Generative Adversarial Network.
   Neural Netw. 2021 Apr;136:207-217. doi: 10.1016/j.neunet.2020.09.002. Epub 2020 Sep 12.
3. PoT-GAN: Pose Transform GAN for Person Image Synthesis.
   IEEE Trans Image Process. 2021;30:7677-7688. doi: 10.1109/TIP.2021.3104183. Epub 2021 Sep 8.
4. Word self-update contrastive adversarial networks for text-to-image synthesis.
   Neural Netw. 2023 Oct;167:433-444. doi: 10.1016/j.neunet.2023.08.038. Epub 2023 Aug 25.
5. Improving Description-based Person Re-identification by Multi-granularity Image-text Alignments.
   IEEE Trans Image Process. 2020 Apr 7. doi: 10.1109/TIP.2020.2984883.
6. Multi-Sentence Auxiliary Adversarial Networks for Fine-Grained Text-to-Image Synthesis.
   IEEE Trans Image Process. 2021;30:2798-2809. doi: 10.1109/TIP.2021.3055062. Epub 2021 Feb 12.
7. Person image generation through graph-based and appearance-decomposed generative adversarial network.
   PeerJ Comput Sci. 2021 Dec 24;7:e761. doi: 10.7717/peerj-cs.761. eCollection 2021.
8. CLIP-Driven Fine-Grained Text-Image Person Re-Identification.
   IEEE Trans Image Process. 2023;32:6032-6046. doi: 10.1109/TIP.2023.3327924. Epub 2023 Nov 7.
9. Pose-Driven Realistic 2-D Motion Synthesis.
   IEEE Trans Cybern. 2023 Apr;53(4):2412-2425. doi: 10.1109/TCYB.2021.3120010. Epub 2023 Mar 16.
10. Unpaired Person Image Generation With Semantic Parsing Transformation.
    IEEE Trans Pattern Anal Mach Intell. 2021 Nov;43(11):4161-4176. doi: 10.1109/TPAMI.2020.2992105. Epub 2021 Oct 1.