Ahmed Farhin, Nidiffer Aaron R, Lalor Edmund C
Department of Biomedical Engineering, Department of Neuroscience, Del Monte Institute for Neuroscience, and Center for Visual Science, University of Rochester, Rochester, NY, United States.
Front Hum Neurosci. 2023 Dec 15;17:1283206. doi: 10.3389/fnhum.2023.1283206. eCollection 2023.
Seeing the speaker's face greatly improves our comprehension of speech in noisy environments. This benefit arises from the brain's ability to combine the auditory and visual information around us, a process known as multisensory integration. Selective attention also strongly influences what we comprehend in scenarios with multiple speakers, an effect known as the cocktail-party phenomenon. However, the interaction between attention and multisensory integration is not fully understood, especially for natural, continuous speech. In a recent electroencephalography (EEG) study, we explored this issue and showed that multisensory integration is enhanced when an audiovisual speaker is attended compared to when that speaker is unattended. Here, we extend that work to investigate how this interaction varies with a person's gaze behavior, which affects the quality of the visual information available to them. To do so, we recorded EEG from 31 healthy adults as they performed selective attention tasks in several paradigms involving two concurrently presented audiovisual speakers. We then modeled how the recorded EEG related to the audio speech (envelope) of the presented speakers. Crucially, we compared two classes of model: one that assumed underlying multisensory integration (AV) and one that assumed two independent unisensory auditory and visual processes (A+V). This comparison revealed evidence of strong attentional effects on multisensory integration when participants were looking directly at the face of an audiovisual speaker. The effect was not apparent when the speaker's face was in participants' peripheral vision. Overall, our findings suggest a strong influence of attention on multisensory integration when high-fidelity visual (articulatory) speech information is available. More generally, they suggest that the interplay between attention and multisensory integration during natural audiovisual speech is dynamic and adapts to the specific task and environment.
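To make the AV versus A+V model comparison concrete, the sketch below illustrates the general logic of contrasting an encoding model fit to audiovisual EEG against an additive model built from separately fit unisensory models, using ridge-regularized stimulus-to-EEG regression in the style of temporal response function (TRF) analyses. This is a minimal, hypothetical example with placeholder random data and assumed parameters (64 Hz EEG, roughly 0-500 ms lags, a single channel, envelope-only regressor); it is not the authors' actual analysis pipeline, in which prediction accuracies would come from cross-validation across trials and participants.

```python
import numpy as np

def lagged_design(stim, n_lags):
    """Time-lagged design matrix for a 1-D stimulus feature (lags 0..n_lags-1 samples)."""
    n = len(stim)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stim[:n - lag]
    return X

def fit_trf(stim, eeg, n_lags, lam=1.0):
    """Ridge-regularized TRF weights: w = (X'X + lam*I)^{-1} X'y."""
    X = lagged_design(stim, n_lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(n_lags), X.T @ eeg)

def predict_r(stim, eeg, w, n_lags):
    """Pearson correlation between TRF-predicted and recorded EEG."""
    y_hat = lagged_design(stim, n_lags) @ w
    return np.corrcoef(y_hat, eeg)[0, 1]

# --- Placeholder data standing in for one EEG channel per condition ---
rng = np.random.default_rng(0)
fs, n_lags = 64, 32                  # assumed 64 Hz EEG, ~0-500 ms of lags
n = fs * 120                         # two minutes of data per condition
env = rng.random(n)                  # audio envelope of the speaker (placeholder)
eeg_a, eeg_v, eeg_av = (rng.standard_normal(n) for _ in range(3))

# Unisensory TRFs (audio-only and visual-only conditions, envelope regressor)
w_a = fit_trf(env, eeg_a, n_lags)
w_v = fit_trf(env, eeg_v, n_lags)

# Additive "A+V" model (sum of unisensory TRFs) vs "AV" model
# trained on the audiovisual condition itself
w_sum = w_a + w_v
w_av = fit_trf(env, eeg_av, n_lags)

# In practice these accuracies are computed on held-out audiovisual data;
# AV reliably exceeding A+V is taken as evidence of multisensory integration.
print("A+V model r:", predict_r(env, eeg_av, w_sum, n_lags))
print("AV  model r:", predict_r(env, eeg_av, w_av, n_lags))
```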