Neuropsychologia. 2012 Apr;50(5):762-76. doi: 10.1016/j.neuropsychologia.2012.01.010. Epub 2012 Jan 17.
Institute of Cognitive Neuroscience, University College London, 17 Queen Square, London WC1N 3AR, UK.
Speech comprehension is a complex human skill that requires the perceiver to combine information from several sources - e.g. voice, face, gesture, linguistic context - to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to identify sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where the positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined in an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as in inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, that showed enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) was associated with increased activation in left inferior frontal gyrus and left posterior STS. The current multimodal speech comprehension paradigm demonstrates the recruitment of a wide comprehension network in the brain, in which posterior STS and fusiform gyrus form sites of convergence for auditory, visual and linguistic information, while left-dominant sites in temporal and frontal cortex support successful comprehension.
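The paper does not publish stimulus-generation code, but the two degradation manipulations named in the abstract are standard signal-processing operations. Below is a minimal Python sketch of Shannon-style noise-vocoding (the Auditory Clarity manipulation) and Gaussian blurring (the Visual Clarity manipulation); the band count, band edges, filter order and blur sigma are illustrative assumptions, not the study's parameters.

```python
# Minimal sketch of the two stimulus degradations, assuming a mono speech
# signal `speech` at sample rate `fs` (with fs > 2 * f_hi) and an RGB video
# frame `frame`. All parameter values are illustrative, not the study's.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert
from scipy.ndimage import gaussian_filter

def noise_vocode(speech, fs, n_bands=4, f_lo=100.0, f_hi=5000.0):
    """Replace spectral fine structure with envelope-modulated noise.

    Fewer bands -> less intelligible speech; more bands -> clearer speech.
    """
    rng = np.random.default_rng(0)
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)   # log-spaced band edges
    noise = rng.standard_normal(len(speech))
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, speech)             # band-limited speech
        envelope = np.abs(hilbert(band))            # amplitude envelope
        carrier = sosfiltfilt(sos, noise)           # band-limited noise
        out += envelope * carrier
    # Match the output RMS to the input so loudness is roughly preserved
    out *= np.sqrt(np.mean(speech ** 2) / (np.mean(out ** 2) + 1e-12))
    return out

def blur_frame(frame, sigma=8.0):
    """Degrade Visual Clarity: blur the spatial axes of an RGB frame."""
    return gaussian_filter(frame.astype(float), sigma=(sigma, sigma, 0))
```

Graded levels of Auditory Clarity are conventionally produced by varying the number of vocoder bands, and graded Visual Clarity by varying the blur sigma, which is how intermediate clarity levels of the kind analysed here are typically constructed.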
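For readers unfamiliar with factorial fMRI analysis, the sketch below illustrates how a two-way interaction of the kind reported here (a visual-clarity benefit that peaks at intermediate auditory clarity) can be expressed as a contrast over cell means in a general linear model. The 3 × 2 design and the contrast weights are illustrative assumptions; the study's actual design matrix is not reproduced here.

```python
# Illustrative cell-means interaction contrast for a hypothetical
# 3 (Auditory Clarity: low/mid/high) x 2 (Visual Clarity: blurred/clear)
# design. Level counts and weights are assumptions, not the study's.
import numpy as np

AUD_LEVELS, VIS_LEVELS = 3, 2
cells = [(a, v) for a in range(AUD_LEVELS) for v in range(VIS_LEVELS)]

# Does the benefit of clear over blurred video peak at mid auditory
# clarity? Weight the visual effect positively at the mid level and
# negatively (by half) at the low and high levels.
c = np.zeros(len(cells))
for i, (a, v) in enumerate(cells):
    vis_effect = 1.0 if v == 1 else -1.0        # clear minus blurred
    c[i] = vis_effect if a == 1 else -0.5 * vis_effect

# The contrast sums to zero and is orthogonal to both main effects, so a
# per-voxel statistic proportional to c @ beta (beta = estimated cell
# means) isolates this interaction component.
assert np.isclose(c.sum(), 0.0)
```

Because the weights cancel within each row and column of the factorial design, this contrast is insensitive to the auditory and visual main effects and responds only to the non-additive combination of the two factors.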