IEEE Trans Pattern Anal Mach Intell. 2022 Aug;44(8):3957-3973. doi: 10.1109/TPAMI.2021.3069023. Epub 2022 Jul 1.
This paper studies the problem of learning the conditional distribution of a high-dimensional output given an input, where the output and input may belong to two different domains, e.g., the output is a photo and the input is a sketch. We solve this problem by cooperative training of a fast-thinking initializer and a slow-thinking solver. The initializer generates the output directly by a non-linear transformation of the input and a noise vector that accounts for latent variability in the output. The slow-thinking solver learns an objective function in the form of a conditional energy function, so that the output can be generated by optimizing the objective function, or more rigorously by sampling from the conditional energy-based model. We propose to learn the two models jointly: the fast-thinking initializer initializes the sampling of the slow-thinking solver, and the solver refines the initial output by an iterative algorithm. The solver learns from the difference between the refined output and the observed output, while the initializer learns from how the solver refines its initial output. We demonstrate the effectiveness of the proposed method on various conditional learning tasks, e.g., class-to-image generation, image-to-image translation, and image recovery. The advantage of our method over GAN-based methods is that it is equipped with a slow-thinking process that refines the solution under the guidance of a learned objective function.
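The training loop described above can be illustrated with a small NumPy sketch. It is a toy under stated assumptions, not the paper's implementation: the initializer is a linear map (the abstract describes a non-linear one), the conditional energy is the quadratic E(y | x) = 0.5 * ||y - xC||^2, and the "iterative algorithm" is taken to be Langevin dynamics. All names (`initializer`, `refine`, `cooperative_step`, `A`, `B`, `C`) and all hyperparameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def initializer(x, z, A, B):
    """Fast-thinking initializer: one-shot map from input x and noise z to an output.
    Assumption: linear here for brevity; the abstract describes a non-linear map."""
    return x @ A + z @ B

def energy(y, x, C):
    """Toy conditional energy E(y | x) = 0.5 * ||y - x @ C||^2 (an assumed form)."""
    return 0.5 * np.sum((y - x @ C) ** 2, axis=1)

def refine(y0, x, C, n_steps=30, step=0.3):
    """Slow-thinking solver: Langevin steps started from the initializer's output.
    Assumption: Langevin dynamics stands in for the unspecified iterative algorithm."""
    y = y0.copy()
    for _ in range(n_steps):
        grad = y - x @ C  # gradient of the toy energy with respect to y
        y = y - 0.5 * step**2 * grad + step * rng.standard_normal(y.shape)
    return y

def cooperative_step(x, y_obs, A, B, C, lr=0.1):
    """One joint update of the solver (C) and the initializer (A, B)."""
    n = x.shape[0]
    z = rng.standard_normal((n, B.shape[0]))
    y_init = initializer(x, z, A, B)
    y_ref = refine(y_init, x, C)
    # Solver: learn from the difference between observed and refined outputs
    # (the maximum-likelihood gradient for this quadratic energy).
    C = C + lr * x.T @ (y_obs - y_ref) / n
    # Initializer: learn from how the solver moved its output, by regressing
    # onto the refined output with the same x and z.
    resid = y_ref - y_init
    A = A + lr * x.T @ resid / n
    B = B + lr * z.T @ resid / n
    return A, B, C

# Synthetic data (hypothetical): outputs are a noisy linear function of inputs.
dx, dz, dy, n = 3, 2, 2, 256
C_true = rng.standard_normal((dx, dy))
X = rng.standard_normal((n, dx))
Y = X @ C_true + 0.1 * rng.standard_normal((n, dy))

A = 0.1 * rng.standard_normal((dx, dy))
B = 0.1 * rng.standard_normal((dz, dy))
C = np.zeros((dx, dy))
for _ in range(300):
    A, B, C = cooperative_step(X, Y, A, B, C)

print("max |C - C_true|:", np.max(np.abs(C - C_true)))
```

In this toy setting the solver's parameter `C` and the initializer's parameter `A` both drift toward the map that generated the data, which mirrors the division of labor in the abstract: the solver is corrected by the observations, and the initializer is corrected by the solver.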