Altieri Nicholas, Pisoni David B, Townsend James T
Department of Psychology, University of Oklahoma, OK 73072, USA.
Seeing Perceiving. 2011;24(6):513-39. doi: 10.1163/187847611X595864. Epub 2011 Sep 29.
Summerfield (1987) proposed several accounts of audiovisual speech perception, a field of research that has burgeoned in recent years. These accounts included the integration of discrete phonetic features, vectors describing the values of independent acoustical and optical parameters, the filter function of the vocal tract, and the articulatory dynamics of the vocal tract. The latter two accounts assume that the representations underlying audiovisual speech perception are based on abstract gestures, while the former two assume that the representations consist of symbolic or featural information obtained from the visual and auditory modalities. Recent converging evidence from several different disciplines indicates that the general framework of Summerfield's feature-based theories should be expanded, and an updated framework building upon them is presented here. We propose a processing model in which auditory and visual brain circuits provide facilitatory information when the inputs are correctly timed, and in which auditory and visual speech representations do not necessarily undergo translation into a common code during information processing. Future research on multisensory processing in speech perception should investigate the connections between auditory and visual brain regions, and employ dynamic modeling tools to further elucidate the timing and information processing mechanisms involved in audiovisual speech integration.