Neuroscience Program, Sabuncu Brain Research Center, Bilkent University, Ankara TR-06800, Turkey.
National Magnetic Resonance Research Center (UMRAM), Bilkent University, Ankara TR-06800, Turkey.
Cereb Cortex. 2021 Oct 1;31(11):4986-5005. doi: 10.1093/cercor/bhab136.
Humans are remarkably adept at listening to a desired speaker in a crowded environment while filtering out nontarget speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear at which levels of speech features attentional modulation occurs, and how strong it is in each brain area, during the cocktail-party task. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses in separate experiments in which subjects either passively listened to single-speaker stories or selectively attended to a male or a female speaker in temporally overlaid stories. Spectral, articulatory, and semantic models of the natural stories were constructed. Intrinsic selectivity profiles were identified via voxelwise models fit to passive-listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that attention causes broad modulations at multiple levels of speech representation that grow stronger toward later stages of processing, and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights into the attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multispeaker environments.
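The analysis logic summarized above (fit voxelwise models on passive-listening data, then compare their predictions for the attended and unattended stories against cocktail-party responses) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the study's pipeline: the closed-form ridge fit, the fixed regularization value, the use of a simple difference in prediction correlations as a modulation score, the function names, and the synthetic data standing in for story features and BOLD responses.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns one weight column per voxel.
    (Assumed estimator; the abstract does not specify the fitting method.)"""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

def modulation_score(X_passive, Y_passive, X_att, X_unatt, Y_cocktail, alpha=1.0):
    """Hypothetical per-voxel score: how much better the passive-listening
    model predicts cocktail-party responses from the attended story's
    features than from the unattended story's features."""
    W = fit_ridge(X_passive, Y_passive, alpha)
    r_att = columnwise_corr(X_att @ W, Y_cocktail)
    r_unatt = columnwise_corr(X_unatt @ W, Y_cocktail)
    return r_att - r_unatt, r_att, r_unatt

# Synthetic stand-in data: 200 time points, 5 features, 3 voxels.
rng = np.random.default_rng(0)
n_t, n_f, n_v = 200, 5, 3
W_true = rng.normal(size=(n_f, n_v))
X_passive = rng.normal(size=(n_t, n_f))
Y_passive = X_passive @ W_true + 0.1 * rng.normal(size=(n_t, n_v))
X_att = rng.normal(size=(n_t, n_f))
X_unatt = rng.normal(size=(n_t, n_f))
# Simulated voxels respond strongly to the attended story, weakly to the other.
Y_cocktail = (X_att @ W_true + 0.2 * (X_unatt @ W_true)
              + 0.1 * rng.normal(size=(n_t, n_v)))

diff, r_att, r_unatt = modulation_score(
    X_passive, Y_passive, X_att, X_unatt, Y_cocktail
)
print("attended r:", np.round(r_att, 2))
print("unattended r:", np.round(r_unatt, 2))
print("modulation score:", np.round(diff, 2))
```

In this toy setup a positive score means the voxel's response tracks the attended story more closely than the unattended one. A nonzero unattended correlation corresponds to the kind of residual representation of unattended speech that the abstract reports.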