Institut de Neurosciences de La Timone, UMR 7289, CNRS and Université Aix-Marseille, Marseille, France.
Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, Maastricht, the Netherlands.
Nat Neurosci. 2023 Apr;26(4):664-672. doi: 10.1038/s41593-023-01285-9. Epub 2023 Mar 16.
Recognizing sounds implicates the cerebral transformation of input waveforms into semantic representations. Although past research identified the superior temporal gyrus (STG) as a crucial cortical region, the computational fingerprint of these cerebral transformations remains poorly characterized. Here, we exploit a model comparison framework and contrast the ability of acoustic, semantic (continuous and categorical) and sound-to-event deep neural network representation models to predict perceived sound dissimilarity and 7 T human auditory cortex functional magnetic resonance imaging responses. We confirm that spectrotemporal modulations predict early auditory cortex (Heschl's gyrus) responses, and that auditory dimensions (for example, loudness, periodicity) predict STG responses and perceived dissimilarity. Sound-to-event deep neural networks predict Heschl's gyrus responses similarly to acoustic models but, notably, they outperform all competing models at predicting both STG responses and perceived dissimilarity. Our findings indicate that STG entails intermediate acoustic-to-semantic sound representations that neither acoustic nor semantic models can account for. These representations are compositional in nature and relevant to behavior.
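The model comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify the distance metric, the scoring statistic, or any cross-validation scheme, so the Euclidean distance, the Pearson correlation, the function names, and the synthetic data below are all assumptions made for illustration. Each candidate model is represented as a feature matrix (one row per sound), and models are ranked by how well their pairwise distances match a target dissimilarity matrix, which stands in for perceived dissimilarity or response-pattern dissimilarity.

```python
import numpy as np


def pairwise_dissimilarity(features):
    """Euclidean distance between every pair of rows (one row per sound)."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def upper_triangle(matrix):
    """Vectorize the unique off-diagonal pairs of a square matrix."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def model_scores(model_features, target_dissimilarity):
    """Correlate each model's pairwise distances with the target dissimilarities.

    Pearson correlation is an assumed, simplified score chosen for brevity.
    """
    target = upper_triangle(target_dissimilarity)
    scores = {}
    for name, features in model_features.items():
        predicted = upper_triangle(pairwise_dissimilarity(features))
        scores[name] = float(np.corrcoef(predicted, target)[0, 1])
    return scores


# Synthetic stand-in data (hypothetical): 20 sounds described by 5 latent features.
rng = np.random.default_rng(0)
n_sounds = 20
latent = rng.normal(size=(n_sounds, 5))

# Two hypothetical models: one is a noisy copy of the latent features,
# the other is unrelated random features.
models = {
    "informative": latent + 0.1 * rng.normal(size=latent.shape),
    "uninformative": rng.normal(size=(n_sounds, 5)),
}

# The target dissimilarity is derived from the latent features themselves.
target = pairwise_dissimilarity(latent)
scores = model_scores(models, target)
print(scores)
```

With real data, the target matrix would come from behavioral dissimilarity judgments or from fMRI response patterns in a region of interest, and a held-out evaluation would be needed before comparing models of different complexity.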