Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation.

Authors

Park Jong-Yun, Tsukamoto Mitsuaki, Tanaka Misato, Kamitani Yukiyasu

Affiliations

Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Kyoto, Japan.

Department of Neuroinformatics, ATR Computational Neuroscience Laboratories, Kyoto, Japan.

Publication information

PLoS Biol. 2025 Jul 23;23(7):e3003293. doi: 10.1371/journal.pbio.3003293. eCollection 2025 Jul.

DOI:10.1371/journal.pbio.3003293
PMID:40700446
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12313072/
Abstract

Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI. Drawing on the hierarchical auditory features of deep neural networks (DNNs) with progressively larger time windows and their neural activity correspondence, we introduce a method for sound reconstruction that integrates brain decoding of DNN features and an audio-generative model. DNN features decoded from auditory cortical activity outperformed spectrotemporal and modulation-based features, enabling perceptually plausible reconstructions across diverse sound categories. Behavioral evaluations and objective measures confirmed that these reconstructions preserved short-term spectral and perceptual properties, capturing the characteristic timbre of speech, animal calls, and musical instruments, while the reconstructed sounds did not reproduce longer temporal sequences with fidelity. Leave-category-out analyses indicated that the method generalizes across sound categories. Reconstructions at higher DNN layers and from early auditory regions revealed distinct contributions to decoding performance. Applying the model to a selective auditory attention ("cocktail party") task further showed that reconstructions reflected the attended sound more strongly than the unattended one in some of the subjects. Despite its inability to reconstruct exact temporal sequences, which may reflect the limited temporal resolution of fMRI, our framework demonstrates the feasibility of mapping brain activity to auditory experiences, a step toward more comprehensive understanding and reconstruction of internal auditory representations.
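
The "brain decoding of DNN features" step mentioned in the abstract can be pictured as a regression from voxel patterns to feature values. The following is only a minimal sketch on synthetic data: the abstract does not say which decoder the authors used, so ridge regression, the array sizes, and the function names (`fit_ridge_decoder`, `decode_features`, `profile_correlation`) are all assumptions made for illustration. The audio-generative model that would turn decoded features into sound is not shown.

```python
import numpy as np

def fit_ridge_decoder(X, Y, alpha=1.0):
    """Fit a linear map from voxels to features (assumed decoder, not the paper's).

    X: (n_samples, n_voxels) brain activity patterns.
    Y: (n_samples, n_units) target DNN feature values.
    Returns W with shape (n_voxels, n_units), solving
    (X^T X + alpha * I) W = X^T Y.
    """
    n_voxels = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_voxels)
    return np.linalg.solve(gram, X.T @ Y)

def decode_features(X, W):
    """Predict feature values for new brain activity patterns."""
    return X @ W

def profile_correlation(Y_true, Y_pred):
    """Pearson correlation across samples, computed separately per feature unit."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

# Synthetic stand-in data (hypothetical sizes; real fMRI data would replace this).
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, n_units = 200, 50, 30, 8
W_true = rng.normal(size=(n_voxels, n_units))
X_train = rng.normal(size=(n_train, n_voxels))
X_test = rng.normal(size=(n_test, n_voxels))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, n_units))
Y_test = X_test @ W_true + 0.1 * rng.normal(size=(n_test, n_units))

W = fit_ridge_decoder(X_train, Y_train, alpha=1.0)
Y_pred = decode_features(X_test, W)
corr = profile_correlation(Y_test, Y_pred)
print("mean per-unit correlation on held-out samples:", corr.mean())
```

In a pipeline of the kind the abstract describes, the decoded features would then be passed to an audio-generative model. A leave-category-out check would correspond to holding out every sample of one sound category when fitting `W`.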

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/3a73e624fdea/pbio.3003293.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/fa9199471bf7/pbio.3003293.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/86bea5384615/pbio.3003293.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/ef21d9d4bbcb/pbio.3003293.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/856c66931d83/pbio.3003293.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/63a57887a813/pbio.3003293.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/4aee76b8c461/pbio.3003293.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a79/12313072/bbf31f2abd08/pbio.3003293.g008.jpg

References cited in this article

1
Spurious reconstruction from brain activity.
Neural Netw. 2025 Oct;190:107515. doi: 10.1016/j.neunet.2025.107515. Epub 2025 May 27.
2
Many but not all deep neural network audio models capture brain responses and exhibit correspondence between model stages and brain regions.
PLoS Biol. 2023 Dec 13;21(12):e3002366. doi: 10.1371/journal.pbio.3002366. eCollection 2023 Dec.
3
Reconstructing visual illusory experiences from human brain activity.
Sci Adv. 2023 Nov 17;9(46):eadj3906. doi: 10.1126/sciadv.adj3906. Epub 2023 Nov 15.
4
Model metamers reveal divergent invariances between biological and artificial neural networks.
Nat Neurosci. 2023 Nov;26(11):2017-2034. doi: 10.1038/s41593-023-01442-0. Epub 2023 Oct 16.
5
A high-performance neuroprosthesis for speech decoding and avatar control.
Nature. 2023 Aug;620(7976):1037-1046. doi: 10.1038/s41586-023-06443-4. Epub 2023 Aug 23.
6
Music can be reconstructed from human auditory cortex activity using nonlinear decoding models.
PLoS Biol. 2023 Aug 15;21(8):e3002176. doi: 10.1371/journal.pbio.3002176. eCollection 2023 Aug.
7
Direct speech reconstruction from sensorimotor brain activity with optimized deep learning models.
J Neural Eng. 2023 Sep 20;20(5):056010. doi: 10.1088/1741-2552/ace8be.
8
Semantic reconstruction of continuous language from non-invasive brain recordings.
Nat Neurosci. 2023 May;26(5):858-866. doi: 10.1038/s41593-023-01304-9. Epub 2023 May 1.
9
Intermediate acoustic-to-semantic representations link behavioral and neural responses to natural sounds.
Nat Neurosci. 2023 Apr;26(4):664-672. doi: 10.1038/s41593-023-01285-9. Epub 2023 Mar 16.
10
Successes and critical failures of neural networks in capturing human-like speech recognition.
Neural Netw. 2023 May;162:199-211. doi: 10.1016/j.neunet.2023.02.032. Epub 2023 Feb 24.