Decoding visual brain representations from electroencephalography through knowledge distillation and latent diffusion models.

Affiliations

Department of Biomedicine and Prevention, University of Rome Tor Vergata, Italy.

Publication Information

Comput Biol Med. 2024 Aug;178:108701. doi: 10.1016/j.compbiomed.2024.108701. Epub 2024 Jun 7.

Abstract

Decoding visual representations from human brain activity has emerged as a thriving research domain, particularly in the context of brain-computer interfaces. Our study presents an innovative method that employs knowledge distillation to train an EEG classifier and reconstruct images from the ImageNet and THINGS-EEG 2 datasets using only electroencephalography (EEG) data from participants who viewed the images themselves (i.e., "brain decoding"). We analyzed EEG recordings from 6 participants for the ImageNet dataset and 10 for the THINGS-EEG 2 dataset, each exposed to images spanning distinct semantic categories. These EEG readings were converted into spectrograms, which were then used to train a convolutional neural network (CNN) integrated with a knowledge distillation procedure driven by a pre-trained Contrastive Language-Image Pre-Training (CLIP) image classification teacher network. This strategy allowed our model to attain a top-5 accuracy of 87%, significantly outperforming a standard CNN and various RNN-based benchmarks. Additionally, we incorporated an image reconstruction mechanism based on pre-trained latent diffusion models, which allowed us to generate an estimate of the images that had elicited the EEG activity. Our architecture therefore not only decodes images from neural activity but also offers a credible image reconstruction from EEG alone, paving the way for, e.g., swift, individualized feedback experiments.
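To make the distillation step concrete, below is a minimal PyTorch sketch of the general setup the abstract describes: a small CNN student trained on EEG spectrograms against soft targets from a frozen CLIP-based image classifier. The network layout, loss weighting, temperature, and tensor shapes are placeholders chosen for illustration, not the paper's actual configuration.

```python
# Minimal sketch of spectrogram-CNN training with knowledge distillation.
# NOT the authors' code: architecture and hyperparameters are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpectrogramCNN(nn.Module):
    """Hypothetical student: maps EEG spectrograms to class logits."""
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.5):
    """Standard (Hinton-style) KD objective: temperature-softened teacher
    targets blended with the hard-label cross-entropy. The paper's exact
    loss weighting is not given in the abstract; alpha and temperature
    here are placeholders."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Usage: teacher_logits would come from a frozen CLIP-based image classifier
# run on the image each participant viewed; the student sees only the EEG
# spectrogram. Shapes and class count below are dummies.
student = SpectrogramCNN(in_channels=1, num_classes=40)
spectrograms = torch.randn(8, 1, 64, 128)   # dummy EEG spectrogram batch
teacher_logits = torch.randn(8, 40)          # stand-in for CLIP teacher output
labels = torch.randint(0, 40, (8,))
loss = distillation_loss(student(spectrograms), teacher_logits, labels)
loss.backward()
```

Softening the teacher distribution with a temperature is the usual trick here: it lets the EEG student learn the relative class similarities the CLIP teacher assigns, rather than only the hard label, which is plausibly what drives the reported gain over a plain CNN baseline.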

