Label-Guided Generative Adversarial Network for Realistic Image Synthesis.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):3311-3328. doi: 10.1109/TPAMI.2022.3186752. Epub 2023 Feb 3.

Abstract

Generating photo-realistic images from labels (e.g., semantic labels or sketch labels) is much more challenging than the general image-to-image translation task, mainly because of the large gap between extremely sparse labels and detail-rich images. We propose a general framework, Lab2Pix, that tackles this issue from two aspects: 1) how to extract useful information from the input; and 2) how to efficiently bridge the gap between labels and images. Specifically, we propose Double-Guided Normalization (DG-Norm), which uses the input label to semantically guide activations in normalization layers and uses global features with large receptive fields to differentiate activations within the same semantic region. To generate images efficiently, we further propose Label Guided Spatial Co-Attention (LSCA), which encourages the learning of incremental visual information with limited model parameters while storing the well-synthesized parts in lower-level features. Accordingly, Hierarchical Perceptual Discriminators with Foreground Enhancement Masks are proposed to compete strongly against the generator, thereby encouraging realistic image generation, and a sharp enhancement loss is further introduced for high-quality, sharp image generation. We instantiate Lab2Pix for the label-to-image task in both unpaired (Lab2Pix-V1) and paired (Lab2Pix-V2) settings. Extensive experiments on various datasets demonstrate that our method significantly outperforms state-of-the-art methods, both quantitatively and qualitatively, in both settings.
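
The idea of letting a label map guide a normalization layer can be illustrated with a toy example. The sketch below is not the paper's DG-Norm; it is a generic label-conditioned normalization written in plain Python under several assumptions: a single-channel feature map, fixed per-label scale and shift tables (`gamma_by_label`, `beta_by_label`) in place of learned ones, and an optional per-pixel `global_guide` map standing in for the global features that separate activations within one semantic region. All of these names are hypothetical.

```python
import math


def label_guided_norm(features, labels, gamma_by_label, beta_by_label,
                      global_guide=None, eps=1e-5):
    """Normalize a single-channel feature map, then modulate it per pixel.

    features       : 2D list of floats (H x W)
    labels         : 2D list of ints, one semantic label per pixel
    gamma_by_label : dict label -> scale (assumed fixed here, learned in practice)
    beta_by_label  : dict label -> shift (assumed fixed here, learned in practice)
    global_guide   : optional 2D list added to the scale, a stand-in for a
                     second, global guidance signal (an assumption of this sketch)
    """
    values = [v for row in features for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var + eps)

    out = []
    for i, row in enumerate(features):
        out_row = []
        for j, v in enumerate(row):
            label = labels[i][j]
            gamma = gamma_by_label[label]
            beta = beta_by_label[label]
            if global_guide is not None:
                # Pixels sharing a label can still receive different scales.
                gamma += global_guide[i][j]
            out_row.append(gamma * (v - mean) / std + beta)
        out.append(out_row)
    return out


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0]]
    labs = [[0, 0], [1, 1]]
    result = label_guided_norm(feats, labs, {0: 1.0, 1: 2.0}, {0: 0.0, 1: 0.5})
    for r in result:
        print(r)
```

With only the label tables, every pixel of a given label is scaled and shifted identically; passing a non-constant `global_guide` is what lets two pixels inside the same region end up with different activations, which is the role the abstract assigns to the global features.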
