Wu Jiasong, Qiu Xiang, Zhang Jing, Wu Fuzhi, Kong Youyong, Yang Guanyu, Senhadji Lotfi, Shu Huazhong
Laboratory of Image Science and Technology, Key Laboratory of Computer Network and Information Integration, Southeast University, Ministry of Education, Nanjing, China.
Jiangsu Provincial Joint International Research Laboratory of Medical Information Processing, School of Computer Science and Engineering, Southeast University, Nanjing, China.
Front Neurorobot. 2021 Oct 26;15:752752. doi: 10.3389/fnbot.2021.752752. eCollection 2021.
Generative adversarial networks (GANs) and variational autoencoders (VAEs) provide impressive image generation from Gaussian white noise, but both are difficult to train, since they need a generator (or encoder) and a discriminator (or decoder) to be trained simultaneously, which can easily lead to unstable training. To solve or alleviate these synchronous training problems of GANs and VAEs, researchers recently proposed generative scattering networks (GSNs), which use wavelet scattering networks (ScatNets) as the encoder to obtain features (or ScatNet embeddings) and convolutional neural networks (CNNs) as the decoder to generate an image. The advantage of GSNs is that the parameters of ScatNets do not need to be learned, while the disadvantage is that the representational ability of ScatNets is slightly weaker than that of CNNs. In addition, the dimensionality reduction method of principal component analysis (PCA) can easily lead to overfitting in the training of GSNs and, therefore, affect the quality of generated images in the testing process. To further improve the quality of generated images while keeping the advantages of GSNs, this study proposes generative fractional scattering networks (GFRSNs), which use more expressive fractional wavelet scattering networks (FrScatNets) instead of ScatNets as the encoder to obtain features (or FrScatNet embeddings) and use CNNs similar to those of GSNs as the decoder to generate an image. Additionally, this study develops a new dimensionality reduction method named feature-map fusion (FMF) instead of performing PCA to better retain the information of FrScatNets; it also discusses the effect of image fusion on the quality of the generated image. The experimental results obtained on the CIFAR-10 and CelebA datasets show that the proposed GFRSNs can lead to better generated images than the original GSNs on testing datasets.
Experimental comparisons of the proposed GFRSNs with deep convolutional GAN (DCGAN), progressive GAN (PGAN), and CycleGAN are also given.
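The abstract contrasts FMF with PCA as the dimensionality reduction step applied to the FrScatNet embeddings before decoding. The exact fusion rule is not specified here, so the sketch below is only a minimal illustration of the general idea, assuming FMF means merging scattering feature maps channel-wise (here, group-wise averaging) so spatial structure is retained, whereas PCA would operate on flattened vectors.

```python
import numpy as np

def fmf_reduce(feature_maps: np.ndarray, groups: int) -> np.ndarray:
    """Reduce a (C, H, W) stack of scattering feature maps to (groups, H, W)
    by averaging contiguous groups of channels. This is a hypothetical
    fusion rule for illustration, not the paper's exact FMF definition."""
    C, H, W = feature_maps.shape
    assert C % groups == 0, "channel count must divide evenly into groups"
    return feature_maps.reshape(groups, C // groups, H, W).mean(axis=1)

# Toy embedding: 8 channels of 4x4 feature maps, fused down to 4 channels.
fm = np.arange(8 * 4 * 4, dtype=float).reshape(8, 4, 4)
reduced = fmf_reduce(fm, groups=4)
print(reduced.shape)  # (4, 4, 4): spatial layout preserved, channels fused
```

Unlike PCA on flattened features, this kind of fusion keeps each output a valid 2-D feature map, which is the property the authors cite for feeding a CNN decoder.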