IEEE Trans Neural Netw Learn Syst. 2022 Feb;33(2):600-614. doi: 10.1109/TNNLS.2020.3028167. Epub 2022 Feb 3.
Reconstructing visual information from human brain activity is an important research topic in brain decoding. Existing methods ignore the structural information underlying brain activity and visual features, which severely limits their performance and interpretability. Here, we propose a hierarchically structured neural decoding framework that uses multitask transfer learning of deep neural network (DNN) representations and a matrix-variate Gaussian prior. Our framework consists of two stages, Voxel2Unit and Unit2Pixel. In Voxel2Unit, we decode functional magnetic resonance imaging (fMRI) data into the intermediate features of a pretrained convolutional neural network (CNN). In Unit2Pixel, we invert the predicted CNN features back into visual images. The matrix-variate Gaussian prior allows us to take into account the structures between feature dimensions and between regression tasks, which are useful for improving decoding effectiveness and interpretability. This contrasts with existing single-output regression models, which usually ignore these structures. We conduct extensive experiments on two real-world fMRI data sets, and the results show that our method predicts CNN features more accurately and reconstructs perceived natural images and faces with higher quality.
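To make the role of a matrix-variate Gaussian prior in multi-output regression concrete, the following is a minimal sketch and not the paper's actual inference procedure, which the abstract does not specify. It assumes a linear model Y = XW + noise with i.i.d. Gaussian noise, a prior W ~ MN(0, row_cov, col_cov) in which rows of W index input (voxel) dimensions and columns index regression tasks (CNN units), and known covariances. The names map_weights, row_cov, col_cov, and lam are hypothetical and chosen for illustration only.

```python
import numpy as np


def map_weights(X, Y, row_cov, col_cov, lam=1.0):
    """MAP estimate of W for Y = X W + noise under W ~ MN(0, row_cov, col_cov).

    Assumption: i.i.d. Gaussian noise; lam plays the role of the noise
    variance (lam=1.0 corresponds to unit-variance noise).
    Minimizes 0.5*||Y - X W||_F^2 + 0.5*lam*tr(row_prec @ W @ col_prec @ W.T).
    """
    d, t = X.shape[1], Y.shape[1]
    row_prec = np.linalg.inv(row_cov)  # structure between input dimensions
    col_prec = np.linalg.inv(col_cov)  # structure between regression tasks

    # Stationarity condition: X^T X W + lam * row_prec W col_prec = X^T Y.
    # With column-major vec, vec(A W B) = (B^T kron A) vec(W); col_prec is
    # symmetric, so the condition becomes one linear system in vec(W).
    lhs = np.kron(np.eye(t), X.T @ X) + lam * np.kron(col_prec, row_prec)
    rhs = (X.T @ Y).reshape(-1, order="F")
    w = np.linalg.solve(lhs, rhs)
    return w.reshape(d, t, order="F")
```

With identity covariances this reduces to independent ridge regressions, one per output, which is the single-output setting the abstract contrasts against. The Kronecker system has (d·t)² entries, so this direct form is only practical for small problems.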