A Causal Inference Model Explains Perception of the McGurk Effect and Other Incongruent Audiovisual Speech.

Author Information

Magnotti John F, Beauchamp Michael S

Affiliation

Department of Neurosurgery and Core for Advanced MRI, Baylor College of Medicine, Houston, Texas, United States of America.

Publication Information

PLoS Comput Biol. 2017 Feb 16;13(2):e1005229. doi: 10.1371/journal.pcbi.1005229. eCollection 2017 Feb.

Abstract

Audiovisual speech integration combines information from auditory speech (talker's voice) and visual speech (talker's mouth movements) to improve perceptual accuracy. However, if the auditory and visual speech emanate from different talkers, integration decreases accuracy. Therefore, a key step in audiovisual speech perception is deciding whether auditory and visual speech have the same source, a process known as causal inference. A well-known illusion, the McGurk Effect, consists of incongruent audiovisual syllables, such as auditory "ba" + visual "ga" (AbaVga), that are integrated to produce a fused percept ("da"). This illusion raises two fundamental questions: first, given the incongruence between the auditory and visual syllables in the McGurk stimulus, why are they integrated; and second, why does the McGurk effect not occur for other, very similar syllables (e.g., AgaVba). We describe a simplified model of causal inference in multisensory speech perception (CIMS) that predicts the perception of arbitrary combinations of auditory and visual speech. We applied this model to behavioral data collected from 60 subjects perceiving both McGurk and non-McGurk incongruent speech stimuli. The CIMS model successfully predicted both the audiovisual integration observed for McGurk stimuli and the lack of integration observed for non-McGurk stimuli. An identical model without causal inference failed to accurately predict perception for either form of incongruent speech. The CIMS model uses causal inference to provide a computational framework for studying how the brain performs one of its most important tasks, integrating auditory and visual speech cues to allow us to communicate with others.
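To make the causal-inference step concrete, the sketch below implements the generic Bayesian causal-inference computation for two noisy cues (in the spirit of Körding et al.'s 2007 framework, which models of this kind build on). It is not the authors' CIMS implementation: CIMS operates over speech representations, whereas this toy version uses a single continuous stimulus dimension, and all parameter names and values are illustrative assumptions.

```python
# Minimal sketch of Bayesian causal inference over two noisy cues.
# NOT the authors' CIMS model; Gaussian likelihoods, a Gaussian stimulus
# prior, and all parameter values are illustrative assumptions.
import numpy as np

def causal_inference_estimate(x_a, x_v,
                              sigma_a=1.0,    # auditory noise SD (assumed)
                              sigma_v=1.0,    # visual noise SD (assumed)
                              sigma_p=10.0,   # SD of the stimulus prior (assumed)
                              mu_p=0.0,       # mean of the stimulus prior (assumed)
                              p_common=0.5):  # prior probability of a single source (assumed)
    va, vv, vp = sigma_a**2, sigma_v**2, sigma_p**2

    # Likelihood of both measurements under a common cause (C = 1):
    # one stimulus, drawn from the prior, generates both cues.
    denom_c1 = va*vv + va*vp + vv*vp
    like_c1 = np.exp(-0.5 * ((x_a - x_v)**2 * vp
                             + (x_a - mu_p)**2 * vv
                             + (x_v - mu_p)**2 * va) / denom_c1) \
              / (2 * np.pi * np.sqrt(denom_c1))

    # Likelihood under independent causes (C = 2): each cue has its own stimulus.
    like_c2 = (np.exp(-0.5 * (x_a - mu_p)**2 / (va + vp)) / np.sqrt(2*np.pi*(va + vp))) \
            * (np.exp(-0.5 * (x_v - mu_p)**2 / (vv + vp)) / np.sqrt(2*np.pi*(vv + vp)))

    # Posterior probability that the auditory and visual cues share one source.
    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Optimal estimates under each causal structure (reliability-weighted averages).
    s_hat_c1 = (x_a/va + x_v/vv + mu_p/vp) / (1/va + 1/vv + 1/vp)  # cues fused
    s_hat_c2 = (x_a/va + mu_p/vp) / (1/va + 1/vp)                  # auditory cue alone

    # Model averaging: weight the fused and unfused estimates by the posterior.
    return post_c1 * s_hat_c1 + (1 - post_c1) * s_hat_c2, post_c1

# Similar cues yield a high posterior of a common cause and strong integration;
# very discrepant cues are largely left unintegrated.
print(causal_inference_estimate(1.0, 1.5))
print(causal_inference_estimate(1.0, 8.0))
```

Model averaging, as used above, is only one possible decision rule; model selection or probability matching over the inferred causal structure are common alternatives in this literature, and the choice is an assumption of this sketch rather than something stated in the abstract.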

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1050/5312805/428926d515a4/pcbi.1005229.g001.jpg
