Cao Jiezhang, Guo Yong, Wu Qingyao, Shen Chunhua, Huang Junzhou, Tan Mingkui
IEEE Trans Pattern Anal Mach Intell. 2022 Jan;44(1):211-227. doi: 10.1109/TPAMI.2020.3012096. Epub 2021 Dec 7.
Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from a predefined prior distribution (e.g., Gaussian noise). However, such a prior distribution is often independent of the real data and may therefore lose semantic information (e.g., the geometric structure or content of images). In practice, this semantic information can be represented by a latent distribution learned from the data. However, such a latent distribution may make data sampling difficult for GAN methods. In this paper, rather than sampling from a predefined prior distribution, we propose a GAN model with local coordinate coding (LCC), termed LCCGAN, to improve the performance of image generation. First, we propose an LCC sampling method in LCCGAN to sample meaningful points from the latent manifold. With the LCC sampling method, we can explicitly exploit the local information on the latent manifold and thus produce new data of promising quality. Second, we propose an improved version, namely LCCGAN++, which introduces a higher-order term into the generator approximation. This term achieves a better approximation and thus further improves performance. More critically, we derive generalization bounds for both LCCGAN and LCCGAN++ and prove that a low-dimensional input is sufficient to achieve good generalization performance. Extensive experiments on several benchmark datasets demonstrate the superiority of the proposed method over existing GAN methods.
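The core idea behind LCC sampling is that a point on the latent manifold is well approximated by a sparse, local linear combination of anchor points (bases) learned from real data; sampling then amounts to drawing convex combinations of a few nearby bases rather than drawing Gaussian noise. The following is a minimal illustrative sketch of that idea, not the paper's implementation: the base set, neighborhood size, and function names are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for bases learned from real-data embeddings;
# in the paper these would come from training, not random initialization.
num_bases, latent_dim = 64, 8
bases = rng.normal(size=(num_bases, latent_dim))

def lcc_sample(bases, num_neighbors=4, rng=rng):
    """Sample a latent point as a convex combination of a few nearby bases.

    This mimics LCC's locality principle: the sampled point stays on a
    local patch of the latent manifold spanned by neighboring anchors.
    """
    # Pick a random anchor, then its nearest neighbors among the bases.
    center = bases[rng.integers(len(bases))]
    dists = np.linalg.norm(bases - center, axis=1)
    idx = np.argsort(dists)[:num_neighbors]
    # Random convex weights keep the sample inside the local patch.
    w = rng.dirichlet(np.ones(num_neighbors))
    return w @ bases[idx]

z = lcc_sample(bases)
# z would then be fed to the generator, G(z), in place of Gaussian noise.
```

Because the weights are nonnegative and sum to one, each sample lies in the convex hull of a small neighborhood of anchors, which is how local information on the manifold is exploited during sampling.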