Second Sight: Using brain-optimized encoding models to align image distributions with human brain activity.

Authors

Kneeland Reese, Ojeda Jordyn, St-Yves Ghislain, Naselaris Thomas

Affiliations

Department of Computer Science, University of Minnesota, Minneapolis MN, 55455.

Department of Neuroscience, University of Minnesota, Minneapolis MN, 55455.

Publication information

ArXiv. 2023 Jun 1:arXiv:2306.00927v1.

PMID:37396609

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10312906/

Abstract

Two recent developments have accelerated progress in image reconstruction from human brain activity: large datasets that offer samples of brain activity in response to many thousands of natural scenes, and the open-sourcing of powerful stochastic image-generators that accept both low- and high-level guidance. Most work in this space has focused on obtaining point estimates of the target image, with the ultimate goal of approximating literal pixel-wise reconstructions of target images from the brain activity patterns they evoke. This emphasis belies the fact that there is always a family of images that are equally compatible with any evoked brain activity pattern, and the fact that many image-generators are inherently stochastic and do not by themselves offer a method for selecting the single best reconstruction from among the samples they generate. We introduce a novel reconstruction procedure (Second Sight) that iteratively refines an image distribution to explicitly maximize the alignment between the predictions of a voxel-wise encoding model and the brain activity patterns evoked by any target image. We use an ensemble of brain-optimized deep neural networks trained on the Natural Scenes Dataset (NSD) as our encoding model, and a latent diffusion model as our image generator. At each iteration, we generate a small library of images and select those that best approximate the measured brain activity when passed through our encoding model. We extract semantic and structural guidance from the selected images, used for generating the next library. We show that this process converges on a distribution of high-quality reconstructions by refining both semantic content and low-level image details across iterations. Images sampled from these converged image distributions are competitive with state-of-the-art reconstruction algorithms. 
Interestingly, the time-to-convergence varies systematically across visual cortex, with earlier visual areas generally taking longer and converging on narrower image distributions, relative to higher-level brain areas. Second Sight thus offers a succinct and novel method for exploring the diversity of representations across visual brain areas.
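
The iterative loop described in the abstract (generate a small library, score it with an encoding model, keep the best matches, derive new guidance) can be illustrated with a toy sketch. This is not the authors' implementation. Everything here is an assumption made for illustration: "images" are plain vectors, a Gaussian sampler stands in for the latent diffusion model, a random linear map stands in for the brain-optimized encoding model, Pearson correlation is used as the alignment score, and the names `refine_distribution`, `encode`, `library_size`, `n_keep`, and the fixed 0.9 narrowing factor are all hypothetical.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation between two 1-D vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def refine_distribution(encode, measured, dim, n_iters=30, library_size=64,
                        n_keep=8, seed=0):
    """Toy iterative search over a Gaussian 'image' distribution.

    encode   : hypothetical encoding model (image vector -> predicted voxel activity)
    measured : measured voxel activity for the target image
    Returns the final mean, the final spread, and the best score per iteration.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)   # stand-in for the guidance signal
    spread = 1.0           # stand-in for the generator's stochasticity
    history = []
    for _ in range(n_iters):
        # 1. Generate a small library of candidates around the current guidance.
        library = mean + spread * rng.standard_normal((library_size, dim))
        # 2. Score each candidate: does its predicted activity match the measurement?
        scores = np.array([pearson(encode(img), measured) for img in library])
        # 3. Keep the best-scoring candidates.
        best = library[np.argsort(scores)[-n_keep:]]
        # 4. Derive new guidance from the selection and narrow the distribution
        #    (the 0.9 schedule is an arbitrary choice for this sketch).
        mean = best.mean(axis=0)
        spread *= 0.9
        history.append(float(scores.max()))
    return mean, spread, history


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dim, n_voxels = 16, 200
    weights = rng.standard_normal((n_voxels, dim))  # toy linear encoding model
    target = rng.standard_normal(dim)
    measured = weights @ target + 0.1 * rng.standard_normal(n_voxels)
    mean, spread, history = refine_distribution(lambda img: weights @ img,
                                                measured, dim)
    print(f"first-iteration best score: {history[0]:.3f}")
    print(f"last-iteration best score:  {history[-1]:.3f}")
```

In this sketch the selection step never looks at the target itself, only at how well each candidate's predicted activity matches the measured activity, which is the property the abstract emphasizes. The per-iteration score history gives a crude way to look at convergence speed, loosely analogous to the time-to-convergence the abstract compares across visual areas.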

Figure (image from the original article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1859/10312906/8f4cd0c97838/nihpp-2306.00927v1-f0006.jpg
