Yang Shuai, Jiang Liming, Liu Ziwei, Loy Chen Change
IEEE Trans Pattern Anal Mach Intell. 2023 Oct;45(10):11869-11883. doi: 10.1109/TPAMI.2023.3284003. Epub 2023 Sep 5.
Recent advances in deep learning have produced many successful unsupervised image-to-image translation models that learn correspondences between two visual domains without paired data. However, building robust mappings between diverse domains remains a great challenge, especially for domains with drastic visual discrepancies. In this paper, we introduce a novel versatile framework, Generative Prior-guided UNsupervised Image-to-image Translation (GP-UNIT), that improves the quality, applicability and controllability of existing translation models. The key idea of GP-UNIT is to distill the generative prior from pre-trained class-conditional GANs to build coarse-level cross-domain correspondences, and to apply the learned prior to adversarial translations to uncover fine-level correspondences. With the learned multi-level content correspondences, GP-UNIT is able to perform valid translations between both close domains and distant domains. For close domains, GP-UNIT can be conditioned on a parameter that determines the intensity of the content correspondences during translation, allowing users to balance content consistency against style consistency. For distant domains, semi-supervised learning is explored to guide GP-UNIT to discover accurate semantic correspondences that are hard to learn solely from appearance. Through extensive experiments, we validate the superiority of GP-UNIT over state-of-the-art translation models in robust, high-quality and diversified translations between various domains.