Intermediate acoustic-to-semantic representations link behavioral and neural responses to natural sounds.

Author information

Institut de Neurosciences de La Timone, UMR 7289, CNRS and Université Aix-Marseille, Marseille, France.

Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands.

Publication information

Nat Neurosci. 2023 Apr;26(4):664-672. doi: 10.1038/s41593-023-01285-9. Epub 2023 Mar 16.

DOI: 10.1038/s41593-023-01285-9

PMID: 36928634

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10076214/

Abstract

Recognizing sounds implicates the cerebral transformation of input waveforms into semantic representations. Although past research identified the superior temporal gyrus (STG) as a crucial cortical region, the computational fingerprint of these cerebral transformations remains poorly characterized. Here, we exploit a model comparison framework and contrast the ability of acoustic, semantic (continuous and categorical) and sound-to-event deep neural network representation models to predict perceived sound dissimilarity and 7 T human auditory cortex functional magnetic resonance imaging responses. We confirm that spectrotemporal modulations predict early auditory cortex (Heschl's gyrus) responses, and that auditory dimensions (for example, loudness, periodicity) predict STG responses and perceived dissimilarity. Sound-to-event deep neural networks predict Heschl's gyrus responses similarly to acoustic models but, notably, they outperform all competing models at predicting both STG responses and perceived dissimilarity. Our findings indicate that STG entails intermediate acoustic-to-semantic sound representations that neither acoustic nor semantic models can account for. These representations are compositional in nature and relevant to behavior.
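The abstract does not spell out how the model comparison is computed. As a rough illustration only, one common way to contrast representation models is to compare the pairwise sound dissimilarities each model implies against a target dissimilarity (behavioral or neural) using a rank correlation. The sketch below assumes that approach; the function names, the synthetic data, and the two toy models ("matched", "unrelated") are hypothetical and are not taken from the paper.

```python
import numpy as np

def pairwise_distances(features):
    """Euclidean distance for every unique pair of rows (sounds)."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(features.shape[0], k=1)
    return dist[upper]

def rank(values):
    """Ranks 0..n-1 of a 1-D array (ties are not averaged in this sketch)."""
    ranks = np.empty(len(values))
    ranks[np.argsort(values)] = np.arange(len(values))
    return ranks

def spearman(a, b):
    """Spearman correlation computed as Pearson correlation of ranks."""
    return np.corrcoef(rank(a), rank(b))[0, 1]

def compare_models(model_features, target_dissimilarity):
    """Score each model by how well its distances track the target."""
    return {
        name: spearman(pairwise_distances(feats), target_dissimilarity)
        for name, feats in model_features.items()
    }

# Synthetic stand-in data (assumption): 20 sounds, 5 latent dimensions.
rng = np.random.default_rng(0)
n_sounds = 20
latent = rng.normal(size=(n_sounds, 5))
# Stands in for perceived or fMRI-derived dissimilarity between sounds.
target = pairwise_distances(latent)

models = {
    # A model whose features are a noisy copy of the latent structure.
    "matched": latent + 0.1 * rng.normal(size=latent.shape),
    # A model whose features are unrelated to the latent structure.
    "unrelated": rng.normal(size=(n_sounds, 5)),
}
scores = compare_models(models, target)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy setup the "matched" model should score well above the "unrelated" one. A real analysis would add cross-validation and a noise ceiling, which this sketch omits.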
