Altieri Nicholas, Pisoni David B, Townsend James T
Department of Psychology, University of Oklahoma, OK 73072, USA.
Seeing Perceiving. 2011;24(6):513-39. doi: 10.1163/187847611X595864. Epub 2011 Sep 29.
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. These accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and the articulatory dynamics of the vocal tract. The latter two accounts assume that the representations underlying audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from the visual and auditory modalities. Recent converging evidence from several different disciplines indicates that the general framework of Summerfield's feature-based theories should be expanded, and an updated framework building upon them is presented here. We propose a processing model in which auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and in which auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and employ dynamic modeling tools to further elucidate the timing and information processing mechanisms involved in audiovisual speech integration.