Simulation of talking faces in the human brain improves auditory speech recognition.

Author Information

von Kriegstein Katharina, Dogan Ozgür, Grüter Martina, Giraud Anne-Lise, Kell Christian A, Grüter Thomas, Kleinschmidt Andreas, Kiebel Stefan J

Affiliations

Wellcome Trust Centre for Neuroimaging, University College London, Queen Square, London WC1N 3BG, United Kingdom.

Publication Information

Proc Natl Acad Sci U S A. 2008 May 6;105(18):6747-52. doi: 10.1073/pnas.0710826105. Epub 2008 Apr 24.

Abstract

Human face-to-face communication is essentially audiovisual. Typically, people talk to us face-to-face, providing concurrent auditory and visual input. Understanding someone is easier when there is visual input, because visual cues like mouth and tongue movements provide complementary information about speech content. Here, we hypothesized that, even in the absence of visual input, the brain optimizes both auditory-only speech and speaker recognition by harvesting speaker-specific predictions and constraints from distinct visual face-processing areas. To test this hypothesis, we performed behavioral and neuroimaging experiments in two groups: subjects with a face recognition deficit (prosopagnosia) and matched controls. The results show that observing a specific person talking for 2 min improves subsequent auditory-only speech and speaker recognition for this person. In both prosopagnosics and controls, behavioral improvement in auditory-only speech recognition was based on an area typically involved in face-movement processing. Improvement in speaker recognition was only present in controls and was based on an area involved in face-identity processing. These findings challenge current unisensory models of speech processing, because they show that, in auditory-only speech, the brain exploits previously encoded audiovisual correlations to optimize communication. We suggest that this optimization is based on speaker-specific audiovisual internal models, which are used to simulate a talking face.

Cited By

The Benefit of Bimodal Training in Voice Learning.
Brain Sci. 2023 Aug 30;13(9):1260. doi: 10.3390/brainsci13091260.

Representation of Expression and Identity by Ventral Prefrontal Neurons.
Neuroscience. 2022 Aug 1;496:243-260. doi: 10.1016/j.neuroscience.2022.05.033. Epub 2022 May 30.

References

Hearing facial identities.
Q J Exp Psychol (Hove). 2007 Oct;60(10):1446-56. doi: 10.1080/17470210601063589.

Hereditary prosopagnosia: the first case series.
Cortex. 2007 Aug;43(6):734-49. doi: 10.1016/s0010-9452(08)70502-1.

Exploring the role of characteristic motion when learning new faces.
Q J Exp Psychol (Hove). 2007 Apr;60(4):519-26. doi: 10.1080/17470210601117559.

The cortical organization of speech processing.
Nat Rev Neurosci. 2007 May;8(5):393-402. doi: 10.1038/nrn2113. Epub 2007 Apr 13.
