Tran Ngoc-Trung, Tran Viet-Hung, Nguyen Ngoc-Bao, Nguyen Trung-Kien, Cheung Ngai-Man
IEEE Trans Image Process. 2021;30:1882-1897. doi: 10.1109/TIP.2021.3049346. Epub 2021 Jan 20.
Recent successes in Generative Adversarial Networks (GANs) have affirmed the importance of using more data in GAN training. However, collecting data is expensive in many domains, such as medical applications. Data Augmentation (DA) has been applied in such settings. In this work, we first argue that the classical DA approach could mislead the generator into learning the distribution of the augmented data, which could differ from that of the original data. We then propose a principled framework, termed Data Augmentation Optimized for GAN (DAG), that enables the use of augmented data in GAN training to improve the learning of the original distribution. We provide theoretical analysis showing that our proposed DAG aligns with the original GAN in minimizing the Jensen-Shannon (JS) divergence between the original distribution and the model distribution. Importantly, the proposed DAG effectively leverages the augmented data to improve the learning of both the discriminator and the generator. We conduct experiments applying DAG to different GAN models (unconditional GAN, conditional GAN, self-supervised GAN and CycleGAN) using datasets of natural images and medical images. The results show that DAG achieves consistent and considerable improvements across these models. Furthermore, when DAG is used in some GAN models, the system establishes state-of-the-art Fréchet Inception Distance (FID) scores. Our code is available at https://github.com/tntrung/dag-gans.
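The abstract's concern that augmentation can change which distribution is learned can be illustrated with a toy calculation. It is a general fact that the JS divergence between two distributions is unchanged when the same invertible transformation is applied to both, whereas a non-invertible transformation can only keep it equal or shrink it. The sketch below checks this on small discrete distributions. It is an illustration only, not the DAG method or code from the authors' repository: the distributions, the function names (`js_divergence`, `rotate`, `merge_pairs`) and the choice of "augmentations" are all assumptions made for this example.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as lists of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def rotate(p):
    """Hypothetical invertible 'augmentation': cyclically relabel the outcomes."""
    return p[1:] + p[:1]

def merge_pairs(p):
    """Hypothetical non-invertible 'augmentation': merge neighbouring outcomes."""
    return [p[i] + p[i + 1] for i in range(0, len(p), 2)]

# Toy stand-ins for the data distribution and the model distribution.
p_data = [0.1, 0.4, 0.3, 0.2]
p_model = [0.25, 0.25, 0.25, 0.25]

js_original = js_divergence(p_data, p_model)
js_rotated = js_divergence(rotate(p_data), rotate(p_model))
js_merged = js_divergence(merge_pairs(p_data), merge_pairs(p_model))

print("original:", js_original)
print("after invertible transform:", js_rotated)
print("after non-invertible transform:", js_merged)
```

In this toy case the invertible relabeling leaves the divergence unchanged, while merging outcomes makes the two distributions indistinguishable even though the originals differ. A model matched only on the merged data would therefore not necessarily match the original data, which is the kind of mismatch the abstract warns about.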