Dong Chenjie, Wang Zhengye, Li Ruqin, Noppeney Uta, Wang Suiping
Philosophy and Social Science Laboratory of Reading and Development in Children and Adolescents, South China Normal University, Guangzhou, 510631, China.
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands.
Psychon Bull Rev. 2025 Apr 24. doi: 10.3758/s13423-025-02697-3.
Face-to-face communication relies on integrating acoustic speech signals with corresponding facial articulations. Audiovisual integration abilities or deficits in typical and atypical populations are often assessed through susceptibility to the McGurk illusion (i.e., McGurk illusion rates). According to theories of normative Bayesian causal inference, observers integrate a visual /ga/ viseme and an auditory /ba/ phoneme, weighted by their relative phonemic reliabilities, into an illusory "da" percept. Consequently, McGurk illusion rates should be strongly influenced by observers' categorical perception of the corresponding facial articulatory movements and acoustic signals. Across three experiments, we investigated the extent to which variability in the McGurk illusion rate across participants or stimuli (i.e., speakers) can be explained by corresponding variations in the categorical perception of the unisensory auditory and visual components. Additionally, we investigated whether McGurk illusion susceptibility is a stable trait across different testing sessions (i.e., days) and tasks. Consistent with the principles of Bayesian causal inference, our results demonstrate that observers' tendency to (mis)perceive the auditory /ba/ and the visual /ga/ stimuli as "da" in unisensory contexts strongly predicts their McGurk illusion rates across both speakers and participants. Likewise, the stability of the McGurk illusion across sessions and tasks aligns closely with the corresponding stability of unisensory auditory and visual categorical perception. Collectively, these findings highlight the importance of accounting for variations in unisensory performance and variability of materials (e.g., speakers) when using audiovisual illusions to assess audiovisual integration capability.
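The reliability-weighted integration the abstract appeals to can be illustrated with a minimal sketch. This is not the authors' model; it is the textbook forced-fusion step of Bayesian cue combination under an assumed common cause, with Gaussian cues whose means and standard deviations (here placed on an arbitrary /ba/–/ga/ continuum) are hypothetical values chosen for illustration:

```python
# Illustrative sketch (not the authors' implementation): under a
# common-cause assumption, Gaussian auditory and visual cues are fused
# with weights proportional to their reliabilities (inverse variances).

def fuse(mu_a: float, sigma_a: float, mu_v: float, sigma_v: float):
    """Reliability-weighted fusion of an auditory and a visual estimate.

    Returns the fused mean and standard deviation. Higher-reliability
    (lower-sigma) cues pull the fused percept toward themselves.
    """
    w_a = 1.0 / sigma_a**2          # auditory reliability
    w_v = 1.0 / sigma_v**2          # visual reliability
    mu = (w_a * mu_a + w_v * mu_v) / (w_a + w_v)
    sigma = (1.0 / (w_a + w_v)) ** 0.5
    return mu, sigma

# Hypothetical placement: auditory /ba/ at 0.0, visual /ga/ at 1.0 on a
# phonemic continuum; equally reliable cues yield an intermediate,
# "da"-like fused estimate at 0.5.
mu, sigma = fuse(mu_a=0.0, sigma_a=1.0, mu_v=1.0, sigma_v=1.0)
print(mu, sigma)
```

If the visual cue is made less reliable (larger `sigma_v`), the fused estimate shifts toward the auditory /ba/, mirroring the abstract's point that unisensory categorical perception should modulate illusion rates.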