Applied Neurocognitive Psychology Lab, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany.
Machine Learning Division, University of Oldenburg, Oldenburg, Germany; Cluster of Excellence "Hearing4all", University of Oldenburg, Oldenburg, Germany.
Neuroimage. 2021 Aug 15;237:118106. doi: 10.1016/j.neuroimage.2021.118106. Epub 2021 May 12.
Speech comprehension in natural soundscapes rests on the ability of the auditory system to extract speech information from a complex acoustic signal with overlapping contributions from many sound sources. Here we reveal the canonical processing of speech in natural soundscapes on multiple scales, using data-driven modeling approaches to characterize the sounds and to analyze ultra-high-field fMRI recorded while participants listened to the audio soundtrack of a movie. We show that, at the functional level, the neuronal processing of speech in natural soundscapes can be surprisingly low-dimensional in the human cortex, highlighting the functional efficiency of the auditory system for a seemingly complex task. In particular, we find that a model comprising three functional dimensions of auditory processing in the temporal lobes is shared across participants' fMRI activity. We further demonstrate that the three functional dimensions are implemented in anatomically overlapping networks that process different aspects of speech in natural soundscapes. One is most sensitive to complex auditory features present in speech, another to complex auditory features and fast temporal modulations that are not specific to speech, and the third mainly codes sound level. These results were derived with few a priori assumptions and provide a detailed and computationally reproducible account of the cortical activity in the temporal lobe elicited by the processing of speech in natural soundscapes.