The MOE Key Lab for Neuroinformation, University of Electronic Science and Technology of China, Chengdu, People's Republic of China.
State Key Laboratory of Brain and Cognitive Science, Beijing MR Center for Brain Research, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China.
Hum Brain Mapp. 2020 Oct 15;41(15):4442-4453. doi: 10.1002/hbm.25136. Epub 2020 Jul 10.
Visual perceptual decoding is an important and challenging topic in cognitive neuroscience. The key step in decoding is building a model that maps visual response signals to visual content. Most previous studies decoded object categories from peak response signals. However, brain activity measured by functional magnetic resonance imaging (fMRI) is a dynamic, time-dependent process, so peak signals cannot fully represent it, which may limit decoding performance. Here, we propose a decoding model based on a long short-term memory (LSTM) network to decode five object categories from multitime response signals evoked by natural images. Experimental results show that the average decoding accuracy across the five subjects using the multitime (2-6 s) response signals is 0.540, significantly higher than that using the peak signals (6 s; accuracy: 0.492; p < .05). In addition, we examine decoding performance for the five object categories across different durations, decoding methods, and visual areas. The analysis of durations and decoding methods indicates that the LSTM-based decoding model, with its sequence modeling ability, can fit the time dependence of the multitime visual response signals and thereby achieve higher decoding performance. The comparison of visual areas shows that the higher visual cortex (VC) contains more of the semantic category information needed for visual perceptual decoding than the lower VC.
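To make the idea of decoding from a multitime sequence concrete, the following is a minimal sketch of an LSTM forward pass that maps a short sequence of voxel response vectors to probabilities over five categories. It is an illustration only, not the authors' implementation: the weights are random and untrained, and the voxel count, hidden size, and the choice of five time points (one sample per second over 2-6 s) are assumptions made here. All function names are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def init_params(n_voxels, n_hidden, n_classes, seed=0):
    # Random, untrained weights (illustration only).
    rng = random.Random(seed)
    p = {}
    for gate in ("i", "f", "o", "g"):
        p["W_" + gate] = rand_matrix(n_hidden, n_voxels + n_hidden, rng)
        p["b_" + gate] = [0.0] * n_hidden
    p["W_out"] = rand_matrix(n_classes, n_hidden, rng)
    p["b_out"] = [0.0] * n_classes
    return p

def gate(p, name, z, activation):
    pre = matvec(p["W_" + name], z)
    return [activation(a + b) for a, b in zip(pre, p["b_" + name])]

def lstm_step(x, h, c, p):
    z = x + h  # concatenate current input with previous hidden state
    i = gate(p, "i", z, sigmoid)    # input gate
    f = gate(p, "f", z, sigmoid)    # forget gate
    o = gate(p, "o", z, sigmoid)    # output gate
    g = gate(p, "g", z, math.tanh)  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(sequence, p):
    # Run the LSTM over all time points, then classify from the final hidden state.
    n_hidden = len(p["b_i"])
    h, c = [0.0] * n_hidden, [0.0] * n_hidden
    for x in sequence:
        h, c = lstm_step(x, h, c, p)
    logits = [a + b for a, b in zip(matvec(p["W_out"], h), p["b_out"])]
    return softmax(logits)

# Assumed sizes: 8 voxels, 4 hidden units, 5 categories, 5 time points (2-6 s).
rng = random.Random(1)
n_voxels, n_hidden, n_classes, n_timepoints = 8, 4, 5, 5
params = init_params(n_voxels, n_hidden, n_classes)
multitime = [[rng.gauss(0, 1) for _ in range(n_voxels)] for _ in range(n_timepoints)]

probs_multitime = decode(multitime, params)   # whole 2-6 s sequence
probs_peak = decode(multitime[-1:], params)   # last time point only (stand-in for the 6 s peak)
```

The two calls differ only in their input: the multitime call lets the hidden state accumulate information across all five time points, while the peak-only call sees a single sample. In a trained model, that accumulated state is what would allow the decoder to exploit temporal dependence in the response.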