Xiao Wulue, Li Jingwei, Zhang Chi, Wang Linyuan, Chen Panpan, Yu Ziya, Tong Li, Yan Bin
School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450001, China.
Henan Key Laboratory of Imaging and Intelligent Processing, PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China.
Brain Sci. 2022 Aug 19;12(8):1101. doi: 10.3390/brainsci12081101.
Visual encoding models based on deep neural networks (DNNs) perform well in predicting brain activity in low-level visual areas. However, because the amount of available neural data is limited, DNN-based visual encoding models are difficult to fit for high-level visual areas, resulting in insufficient encoding performance. The hierarchical organization of the ventral stream suggests that higher visual areas receive information from lower visual areas, a property that current encoding models do not fully exploit. In the present study, we propose a novel visual encoding model framework that uses the hierarchy of representations in the ventral stream to improve model performance in high-level visual areas. Under this framework, we propose two categories of hierarchical encoding models, from the voxel perspective and from the feature perspective, to realize the hierarchical representations. From the voxel perspective, we first constructed an encoding model for a low-level visual area (V1 or V2) and extracted the voxel space predicted by that model. We then used the extracted voxel space of the low-level visual area to predict the voxel space of a high-level visual area (V4 or LO) by constructing a voxel-to-voxel model. From the feature perspective, the feature space of the first model is extracted to predict the voxel space of the high-level visual area. The experimental results show that both categories of hierarchical encoding models effectively improve encoding performance in V4 and LO. In addition, the proportion of best-encoded voxels for the different models in V4 and LO shows that our proposed models have a clear advantage in prediction accuracy. We find that the hierarchy of representations in the ventral stream has a positive effect on improving the performance of existing models in high-level visual areas.
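The two-stage pipeline described in the abstract can be summarized in a short sketch. This is a minimal illustration only, assuming ridge regression as the linear mapping, synthetic arrays in place of the DNN features and fMRI recordings, and hypothetical data shapes and regularization settings; the paper's actual feature extraction, model architecture, and fitting procedure are not specified in the abstract.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for the real data: DNN features of the stimuli and
# fMRI voxel responses in a low-level (V1) and a high-level (V4) area.
# All shapes and alpha values below are illustrative assumptions.
n_stimuli, n_features = 1200, 512
n_v1_voxels, n_v4_voxels = 300, 150
features = rng.standard_normal((n_stimuli, n_features))
v1 = rng.standard_normal((n_stimuli, n_v1_voxels))
v4 = rng.standard_normal((n_stimuli, n_v4_voxels))

X_tr, X_te, v1_tr, v1_te, v4_tr, v4_te = train_test_split(
    features, v1, v4, test_size=0.2, random_state=0)

# Stage 1: standard encoding model for the low-level area
# (DNN feature space -> V1 voxel space).
low_model = Ridge(alpha=1.0).fit(X_tr, v1_tr)
v1_pred_tr = low_model.predict(X_tr)
v1_pred_te = low_model.predict(X_te)

# Stage 2a, "voxel perspective": a voxel-to-voxel model that maps the
# predicted low-level voxel space onto the high-level voxel space.
vox2vox = Ridge(alpha=1.0).fit(v1_pred_tr, v4_tr)
v4_pred_vox = vox2vox.predict(v1_pred_te)

# Stage 2b, "feature perspective": reuse the first model's feature space
# to predict the high-level voxel space directly.
feat2vox = Ridge(alpha=1.0).fit(X_tr, v4_tr)
v4_pred_feat = feat2vox.predict(X_te)

# Evaluate each voxel by the Pearson correlation between predicted and
# measured responses, a common encoding-performance metric.
def voxelwise_corr(pred, true):
    pred_c = pred - pred.mean(0)
    true_c = true - true.mean(0)
    return (pred_c * true_c).sum(0) / (
        np.linalg.norm(pred_c, axis=0) * np.linalg.norm(true_c, axis=0))

print("voxel-to-voxel mean r:", voxelwise_corr(v4_pred_vox, v4_te).mean())
print("feature-based mean r:", voxelwise_corr(v4_pred_feat, v4_te).mean())
```

With real data, the random arrays would be replaced by features extracted from the stimulus images and by the measured V1/V2 and V4/LO voxel responses, and the per-voxel correlations would be compared against a direct feature-to-high-level baseline.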