School of Electrical and Computer Engineering, USA; Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN, 47906, USA.
Weldon School of Biomedical Engineering, USA; School of Electrical and Computer Engineering, USA; Purdue Institute for Integrative Neuroscience, Purdue University, West Lafayette, IN, 47906, USA.
Neuroimage. 2019 Sep;198:125-136. doi: 10.1016/j.neuroimage.2019.05.039. Epub 2019 May 16.
Goal-driven and feedforward-only convolutional neural networks (CNN) have been shown to predict and decode cortical responses to natural images or videos. Here, we explored an alternative deep neural network, the variational auto-encoder (VAE), as a computational model of the visual cortex. We trained a VAE with a five-layer encoder and a five-layer decoder to learn visual representations from a diverse set of unlabeled images. Using the trained VAE, we predicted and decoded cortical activity observed with functional magnetic resonance imaging (fMRI) from three human subjects passively watching natural videos. Compared to the CNN, the VAE predicted the video-evoked cortical responses with comparable accuracy in early visual areas, but with lower accuracy in higher-order visual areas. The difference in encoding performance between the CNN and the VAE was primarily attributable to their different learning objectives, rather than to differences in model architecture or number of parameters. Despite its lower encoding accuracy, the VAE offered a more convenient strategy for decoding the fMRI activity to reconstruct the video input: first converting the fMRI activity to the VAE's latent variables, and then converting the latent variables to the reconstructed video frames through the VAE's decoder. This strategy outperformed alternative decoding methods, e.g. partial least squares regression, in reconstructing both the spatial structure and the color of the visual input. These findings highlight the VAE as an unsupervised model for learning visual representation, as well as its potential and limitations for explaining cortical responses and reconstructing naturalistic and diverse visual experiences.
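As background for the model described above, the core of a VAE can be sketched in a few lines: the encoder outputs a mean and log-variance per latent dimension, a sample is drawn via the reparameterization trick, and the training objective combines a reconstruction term with a KL-divergence regularizer. The sketch below is a minimal, generic illustration in numpy; the dimensions and the mean-squared-error reconstruction term are assumptions for illustration, not the paper's exact five-layer architecture or loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_divergence(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + log_var - mu**2 - np.exp(log_var), axis=-1)

def vae_loss(x, x_recon, mu, log_var):
    """Negative ELBO: reconstruction error (MSE here) plus KL term."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return np.mean(recon + kl_divergence(mu, log_var))

# Sanity check: when the posterior equals the prior (mu = 0, log_var = 0),
# the KL term is exactly zero.
mu = np.zeros((4, 8))
log_var = np.zeros((4, 8))
assert np.allclose(kl_divergence(mu, log_var), 0.0)
```

Minimizing this objective is what pushes the latent space toward a smooth, approximately Gaussian code, which is the property the decoding strategy below relies on.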
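The two-step decoding strategy — regress fMRI responses onto the VAE's latent variables, then run the fitted latents through the decoder — can be sketched as follows. The closed-form ridge regression and the linear stand-in "decoder" are illustrative assumptions on synthetic data, not the paper's trained network or regression pipeline.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression mapping fMRI responses X to latents Y:
    W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def decode_latents(Z, W_dec):
    """Hypothetical stand-in for the VAE's trained decoder; any function
    mapping latent vectors to image frames would slot in here."""
    return Z @ W_dec

# Synthetic data: 200 time points, 50 voxels, 16 latent dimensions.
n_t, n_vox, n_lat = 200, 50, 16
X_fmri = rng.standard_normal((n_t, n_vox))
W_true = rng.standard_normal((n_vox, n_lat))
Z_true = X_fmri @ W_true  # pretend latents are linearly encoded in fMRI

# Step 1: fMRI activity -> latent variables.
W = fit_ridge(X_fmri, Z_true, alpha=0.1)
Z_hat = X_fmri @ W

# Step 2: latent variables -> reconstructed frames via the stand-in decoder.
W_dec = rng.standard_normal((n_lat, 64))  # 64 = flattened-pixel stand-in
frames = decode_latents(Z_hat, W_dec)
print(frames.shape)  # (200, 64)
```

The appeal of this route over direct voxel-to-pixel regression (e.g. partial least squares) is that the regression only has to fit the low-dimensional latent code; the generative decoder then supplies the high-dimensional structure of the image.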