Dai Longquan, Tang Jinhui
IEEE Trans Pattern Anal Mach Intell. 2022 Aug;44(8):4151-4162. doi: 10.1109/TPAMI.2021.3062849. Epub 2022 Jul 1.
We propose iFlowGAN, which learns an invertible flow (a sequence of invertible mappings) via adversarial learning and exploits it to transform a source distribution into a target distribution for unsupervised image-to-image translation. Existing GAN-based generative models such as CycleGAN [1], StarGAN [2], AGGAN [3], and CyCADA [4] need to learn a highly under-constrained forward mapping F: X → Y from a source domain X to a target domain Y. Researchers constrain this problem by assuming a backward mapping B: Y → X such that x and y are fixed points of the composite functions B∘F and F∘B. Inspired by zero-order reverse filtering [5], we (1) interpret F via contraction mappings on a metric space; (2) provide a simple yet effective algorithm that expresses B via the parameters of F in light of the Banach fixed-point theorem; and (3) provide a Lipschitz-regularized network, which indicates a general approach to composing the inverse of an arbitrary Lipschitz-regularized network via the Banach fixed-point theorem. This network is useful for image-to-image translation tasks because it avoids storing separate weights for B. Although memory can also be saved by directly tying the weights of the forward and backward mappings, doing so degrades translation performance significantly. This explains why current GAN-based generative models, including CycleGAN, must use different parameters for the forward and backward mappings rather than sharing the same weights between them. Taking advantage of the Lipschitz-regularized network, we not only build iFlowGAN to remove the parameter redundancy of CycleGAN but also assemble corresponding iFlowGAN versions of StarGAN, AGGAN, and CyCADA without altering their network architectures. Extensive experiments show that the iFlowGAN versions produce results comparable to the original implementations while using half as many parameters.
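The inversion mechanism the abstract invokes can be illustrated with a minimal NumPy sketch (not the authors' implementation; all names here are illustrative). Assume a residual mapping F(x) = x + g(x) where g has Lipschitz constant L < 1; then F is invertible, and x = F⁻¹(y) is the unique fixed point of T(x) = y − g(x), which the Banach fixed-point theorem guarantees can be found by simple iteration using only the parameters of F:

```python
import numpy as np

rng = np.random.default_rng(0)

# g: a random linear map rescaled so its spectral norm (its Lipschitz
# constant) is 0.5 < 1, making F(x) = x + g(x) a bi-Lipschitz bijection.
W = rng.standard_normal((4, 4))
W *= 0.5 / np.linalg.norm(W, 2)

def g(x):
    return W @ x

def F(x):
    # Forward mapping; its "weights" (W) are all we store.
    return x + g(x)

def invert_F(y, iters=60):
    """Recover x with F(x) = y by iterating the contraction x <- y - g(x).

    Since Lip(g) = 0.5, the error shrinks by at least half per step,
    so no separate backward network B is needed.
    """
    x = y.copy()          # any starting point converges
    for _ in range(iters):
        x = y - g(x)
    return x

x_true = rng.standard_normal(4)
y = F(x_true)
x_rec = invert_F(y)
print(np.allclose(x_rec, x_true, atol=1e-8))  # True: iteration converged
```

This mirrors the abstract's point: the backward mapping is realized through the forward network's own parameters, so only one set of weights is stored, at the cost of a few forward evaluations of g at inversion time.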