Neural Encoding and Decoding with Deep Learning for Dynamic Natural Vision.

Affiliations

School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA.

Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN, USA.

Publication Information

Cereb Cortex. 2018 Dec 1;28(12):4136-4160. doi: 10.1093/cercor/bhx268.

Abstract

Convolutional neural network (CNN) driven by image recognition has been shown to be able to explain cortical responses to static pictures at ventral-stream areas. Here, we further showed that such CNN could reliably predict and decode functional magnetic resonance imaging data from humans watching natural movies, despite its lack of any mechanism to account for temporal dynamics or feedback processing. Using separate data, encoding and decoding models were developed and evaluated for describing the bi-directional relationships between the CNN and the brain. Through the encoding models, the CNN-predicted areas covered not only the ventral stream, but also the dorsal stream, albeit to a lesser degree; single-voxel response was visualized as the specific pixel pattern that drove the response, revealing the distinct representation of individual cortical location; cortical activation was synthesized from natural images with high-throughput to map category representation, contrast, and selectivity. Through the decoding models, fMRI signals were directly decoded to estimate the feature representations in both visual and semantic spaces, for direct visual reconstruction and semantic categorization, respectively. These results corroborate, generalize, and extend previous findings, and highlight the value of using deep learning, as an all-in-one model of the visual cortex, to understand and decode natural vision.
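
The abstract describes the modeling approach only at a high level. As a rough, hedged illustration of the bi-directional idea (not the authors' actual pipeline), the sketch below fits a regularized linear encoding model from CNN-layer features to voxel responses and a reverse decoding model from voxel responses back to the feature space, using synthetic arrays in place of the movie-fMRI data. The layer dimensionality, voxel count, ridge penalty, and use of scikit-learn's Ridge estimator are all illustrative assumptions.

# Minimal sketch of CNN-feature <-> fMRI encoding/decoding, assuming
# regularized linear maps and synthetic data in place of the real
# movie-fMRI dataset. Shapes and hyperparameters are illustrative only.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

n_train, n_test = 2000, 400   # fMRI time points (assumed, after HRF alignment)
n_features = 512              # units of one CNN layer (assumed dimensionality)
n_voxels = 1000               # cortical voxels in a region of interest (assumed)

# Synthetic stand-ins: CNN-layer features per time point and voxel responses.
X_train = rng.standard_normal((n_train, n_features))
X_test = rng.standard_normal((n_test, n_features))
true_weights = 0.1 * rng.standard_normal((n_features, n_voxels))
Y_train = X_train @ true_weights + 0.5 * rng.standard_normal((n_train, n_voxels))
Y_test = X_test @ true_weights + 0.5 * rng.standard_normal((n_test, n_voxels))

# Encoding model: predict each voxel's response from the CNN features.
encoder = Ridge(alpha=10.0)
encoder.fit(X_train, Y_train)
Y_pred = encoder.predict(X_test)
print("encoding R^2 (mean over voxels):",
      r2_score(Y_test, Y_pred, multioutput="uniform_average"))

# Decoding model: estimate the CNN feature representation from voxel responses,
# which could then feed visual reconstruction or semantic categorization.
decoder = Ridge(alpha=10.0)
decoder.fit(Y_train, X_train)
X_est = decoder.predict(Y_test)
print("decoding R^2 (mean over features):",
      r2_score(X_test, X_est, multioutput="uniform_average"))

In this sketch the encoding and decoding directions are trained independently on held-out splits, mirroring the abstract's statement that separate data were used to develop and evaluate the two kinds of models.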

Cited By

Deep learning health space model for ordered responses.
BMC Med Inform Decis Mak. 2025 May 16;25(1):191. doi: 10.1186/s12911-025-03026-3.

Compression-enabled interpretability of voxelwise encoding models.
PLoS Comput Biol. 2025 Feb 19;21(2):e1012822. doi: 10.1371/journal.pcbi.1012822. eCollection 2025 Feb.

References

'What' Is Happening in the Dorsal Visual Pathway.
Trends Cogn Sci. 2016 Oct;20(10):773-784. doi: 10.1016/j.tics.2016.08.003. Epub 2016 Sep 5.

Long-Term Recurrent Convolutional Networks for Visual Recognition and Description.
IEEE Trans Pattern Anal Mach Intell. 2017 Apr;39(4):677-691. doi: 10.1109/TPAMI.2016.2599174. Epub 2016 Sep 1.

Visual Interpretation of Kernel-Based Prediction Models.
Mol Inform. 2011 Sep;30(9):817-26. doi: 10.1002/minf.201100059. Epub 2011 Sep 5.

A multi-modal parcellation of human cerebral cortex.
Nature. 2016 Aug 11;536(7615):171-178. doi: 10.1038/nature18933. Epub 2016 Jul 20.
