Qiao Kai, Chen Jian, Wang Linyuan, Zhang Chi, Tong Li, Yan Bin
PLA Strategy Support Force Information Engineering University, Zhengzhou 450001, China.
Neuroscience. 2020 Sep 15;444:92-105. doi: 10.1016/j.neuroscience.2020.07.040. Epub 2020 Jul 28.
In visual decoding, the most difficult task is visual reconstruction: recovering the presented visual stimuli from the corresponding human brain activity recorded with functional magnetic resonance imaging (fMRI). The task is especially hard when the stimuli are natural images. Recent work has treated visual reconstruction as image generation conditioned on fMRI voxels and has begun to use generative adversarial networks (GANs) to build computational models for it. Although GAN-based methods have brought considerable improvement, the fidelity and naturalness of the reconstructed images remain unsatisfactory, for reasons that include the small number of fMRI data samples and the instability of GAN training. In this study, we propose a new GAN-based Bayesian visual reconstruction model (GAN-BVRM) to avoid the conflict between naturalness and fidelity in current GAN-based methods. GAN-BVRM consists of a classifier that decodes categories from fMRI data, the pre-trained conditional generator of the well-known BigGAN, which generates natural images of the specified categories, and a set of encoding models together with an evaluator that assess the generated images. Because it is composed of neural networks, GAN-BVRM is fully differentiable and can generate the reconstructed images directly by iteratively updating the noise input vector through backpropagation to fit the fMRI voxels. In this process, the decoded categories determine the semantic content of the reconstructed images, and the encoding models determine their detailed content. Experimental results showed that GAN-BVRM improved both fidelity and naturalness, which supports the advantage of combining GANs with a Bayesian approach for visual reconstruction.
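The noise-vector update described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a tiny fixed tanh generator standing in for the pre-trained BigGAN generator, a linear map standing in for the encoding models, and a mean squared error standing in for the evaluator. Category conditioning is omitted, and all names (`generate`, `encode`, `reconstruct`, `demo`) are hypothetical. Only the latent vector `z` is updated; the generator and encoder weights stay frozen.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def mat_t_vec(W, v):
    """Multiply the transpose of W by vector v."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]


def generate(W_g, z):
    """Toy frozen generator (assumption): image = tanh(W_g z)."""
    return [math.tanh(a) for a in matvec(W_g, z)]


def encode(W_e, image):
    """Toy frozen encoding model (assumption): voxels = W_e image."""
    return matvec(W_e, image)


def voxel_loss(pred, target):
    """Mean squared error between predicted and measured voxels."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def reconstruct(W_g, W_e, target, steps=500, lr=0.02):
    """Gradient descent on the noise vector z only; weights are never changed."""
    z = [0.0] * len(W_g[0])
    history = []
    for _ in range(steps):
        image = generate(W_g, z)
        pred = encode(W_e, image)
        history.append(voxel_loss(pred, target))
        # Backpropagate by hand: loss -> voxels -> image -> pre-activation -> z.
        d_pred = [2.0 * (p - t) / len(target) for p, t in zip(pred, target)]
        d_image = mat_t_vec(W_e, d_pred)
        d_pre = [g * (1.0 - x * x) for g, x in zip(d_image, image)]
        d_z = mat_t_vec(W_g, d_pre)
        z = [zi - lr * gi for zi, gi in zip(z, d_z)]
    history.append(voxel_loss(encode(W_e, generate(W_g, z)), target))
    return z, history


def demo(seed=0):
    """Build random toy weights and synthetic target voxels, then reconstruct."""
    rng = random.Random(seed)
    latent, pixels, voxels = 4, 8, 6
    W_g = [[rng.gauss(0, 0.3) for _ in range(latent)] for _ in range(pixels)]
    W_e = [[rng.gauss(0, 0.3) for _ in range(pixels)] for _ in range(voxels)]
    z_true = [rng.gauss(0, 1) for _ in range(latent)]
    target = encode(W_e, generate(W_g, z_true))
    return reconstruct(W_g, W_e, target)


if __name__ == "__main__":
    z, history = demo()
    print(f"voxel loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

In this sketch the frozen generator plays the role of an image prior and the encoder-based loss plays the role of a likelihood term, which is one loose way to read the "Bayesian" framing. A real system would rely on automatic differentiation rather than hand-written gradients.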