The Clinical Hospital of Chengdu Brain Science Institute, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China.
MOE Key Lab for Neuroinformation; High-Field Magnetic Resonance Brain Imaging Key Laboratory of Sichuan Province, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China.
J Neural Eng. 2020 Oct 13;17(5):056013. doi: 10.1088/1741-2552/abb691.
Visual perception decoding plays an important role in understanding the visual system. Recent functional magnetic resonance imaging (fMRI) studies have made great advances in predicting the visual content of a single stimulus from the evoked response. In this work, we proposed a novel framework that extends previous work by simultaneously decoding both the temporal and the category information of visual stimuli from fMRI activity.
3 T fMRI data were acquired from five volunteers while they viewed five categories of natural images presented at random intervals. For each subject, we trained two classification-based decoding modules: one to identify the occurrence time of a visual stimulus and one to identify its semantic category. In each module, we adopted a recurrent neural network (RNN), which has proven highly effective at learning nonlinear representations from sequential data, to analyze the temporal dynamics of fMRI activity patterns. Finally, we integrated the two modules into a complete framework.
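A minimal sketch of what such an RNN-based decoding module might look like, assuming PyTorch; the LSTM variant, voxel count, window length, and the two-module interface shown here are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn

class DecodingModule(nn.Module):
    """Hypothetical RNN classifier mapping a window of fMRI volumes
    (one voxel pattern per time point) to class logits."""
    def __init__(self, n_voxels, n_classes, hidden_size=128):
        super().__init__()
        self.rnn = nn.LSTM(input_size=n_voxels, hidden_size=hidden_size,
                           batch_first=True)
        self.classifier = nn.Linear(hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, time, n_voxels) -- a sliding window of fMRI activity
        _, (h_n, _) = self.rnn(x)
        return self.classifier(h_n[-1])  # logits from the last hidden state

# Two modules, per the paper's description: one detects whether a stimulus
# occurred at a given time point, the other assigns one of five categories.
timing_module = DecodingModule(n_voxels=1000, n_classes=2)    # stimulus vs. rest
category_module = DecodingModule(n_voxels=1000, n_classes=5)  # five image classes

# In the integrated framework, the category module would be consulted only
# at time points where the timing module signals a stimulus.
window = torch.randn(8, 10, 1000)  # 8 windows of 10 volumes x 1000 voxels
onset_logits = timing_module(window)
category_logits = category_module(window)
```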
The proposed framework showed promising decoding performance: the average decoding accuracy across the five subjects was more than 19 times the chance level. Moreover, we compared the decoding performance of the early visual cortex (eVC) and the high-level visual cortex (hVC). The comparison indicated that both eVC and hVC participate in processing visual stimuli, but that the semantic information of the stimuli is represented mainly in hVC.
The proposed framework advances the decoding of visual experiences and facilitates a better understanding of visual function.