Retrieving and reconstructing conceptually similar images from fMRI with latent diffusion models and a neuro-inspired brain decoding model.

Author information

Ferrante Matteo, Boccato Tommaso, Passamonti Luca, Toschi Nicola

Affiliations

Department of Biomedicine and Prevention, University of Rome, Tor Vergata, Rome, Italy.

CNR, Istituto di Bioimmagini e Fisiologia Molecolare, Milan, Italy.

Publication information

J Neural Eng. 2024 Jun 28;21(4). doi: 10.1088/1741-2552/ad593c.

DOI:10.1088/1741-2552/ad593c

PMID:38885689

Abstract

Brain decoding is a field of computational neuroscience that aims to infer mental states or internal representations of perceptual inputs from measurable brain activity. This study proposes a novel approach to brain decoding that relies on semantic and contextual similarity.

We use several functional magnetic resonance imaging (fMRI) datasets of natural images as stimuli and create a deep learning decoding pipeline inspired by the bottom-up and top-down processes in human vision. Our pipeline includes a linear brain-to-feature model that maps fMRI activity to semantic visual stimuli features. We assume that the brain projects visual information onto a space that is homeomorphic to the latent space of the last layer of a pretrained neural network, which summarizes and highlights similarities and differences between concepts. These features are categorized in the latent space using a nearest-neighbor strategy, and the results are used to retrieve images or condition a generative latent diffusion model to create novel images.

We demonstrate semantic classification and image retrieval on three different fMRI datasets: Generic Object Decoding (vision perception and imagination), BOLD5000, and NSD. In all cases, a simple mapping between fMRI and a deep semantic representation of the visual stimulus resulted in meaningful classification and retrieved or generated images. We assessed quality using quantitative metrics and a human evaluation experiment that reproduces the multiplicity of conscious and unconscious criteria that humans use to evaluate image similarity. Our method achieved correct evaluation in over 80% of the test set.

Our study proposes a novel approach to brain decoding that relies on semantic and contextual similarity. The results demonstrate that measurable neural correlates can be linearly mapped onto the latent space of a neural network to synthesize images that match the original content. These findings have implications for both cognitive neuroscience and artificial intelligence.
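The two decoding steps named in the abstract, a linear brain-to-feature map followed by a nearest-neighbor lookup in the latent space, can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic, the ridge penalty `alpha` and the cosine similarity measure are assumptions (the abstract only states that the map is linear and that a nearest-neighbor strategy is used), and the names `fit_linear_map` and `nearest_neighbors` are hypothetical.

```python
import numpy as np


def fit_linear_map(X, Y, alpha=1.0):
    """Closed-form ridge regression from voxels X to features Y.

    Solves (X^T X + alpha * I) W = X^T Y for W.
    Ridge regularization is an assumption of this sketch.
    """
    n_voxels = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_voxels), X.T @ Y)


def nearest_neighbors(pred, gallery, k=1):
    """Indices of the k most cosine-similar gallery rows per predicted row."""
    p = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = p @ g.T  # shape: (n_pred, n_gallery)
    return np.argsort(-sims, axis=1)[:, :k]


# Synthetic stand-ins for real data (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_feat = 200, 20, 50, 16

# Hypothetical generative model: voxels are a noisy linear function of features.
true_map = rng.normal(size=(n_feat, n_voxels))
feats_train = rng.normal(size=(n_train, n_feat))
feats_test = rng.normal(size=(n_test, n_feat))
fmri_train = feats_train @ true_map + 0.1 * rng.normal(size=(n_train, n_voxels))
fmri_test = feats_test @ true_map + 0.1 * rng.normal(size=(n_test, n_voxels))

# Step 1: fit the linear brain-to-feature model on training trials.
W = fit_linear_map(fmri_train, feats_train, alpha=1.0)

# Step 2: predict features for held-out trials and retrieve the closest
# candidate from a gallery (here, the true test features themselves).
pred_feats = fmri_test @ W
top1 = nearest_neighbors(pred_feats, feats_test, k=1)[:, 0]
accuracy = float(np.mean(top1 == np.arange(n_test)))
print(f"top-1 retrieval accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting the gallery would hold features extracted from candidate images by a pretrained network, and the retrieved neighbors (or their labels) could then serve as the retrieval result or as conditioning input for a generative model, as the abstract describes.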
