Schwartz Jean-Luc, Berthommier Frédéric, Savariaux Christophe
Institut de la Communication Parlée, CNRS-INPG-Université Stendhal, 46 Av. Félix Viallet, 38031 Grenoble 1, France.
Cognition. 2004 Sep;93(2):B69-78. doi: 10.1016/j.cognition.2004.01.006.
Lip reading is the ability to partially understand speech by looking at the speaker's lips. It improves the intelligibility of speech in noise when audio-visual perception is compared with audio-only perception. A recent set of experiments showed that seeing the speaker's lips also enhances sensitivity to acoustic information, decreasing the auditory detection threshold of speech embedded in noise [J. Acoust. Soc. Am. 109 (2001) 2272; J. Acoust. Soc. Am. 108 (2000) 1197]. However, detection is different from comprehension, and it remains to be seen whether improved sensitivity also results in an intelligibility gain in audio-visual speech perception. In this work, we use an original paradigm to show that seeing the speaker's lips enables the listener to hear better and hence to understand better. The audio-visual stimuli used here could not be differentiated by lip reading per se since they contained exactly the same lip gesture matched with different compatible speech sounds. Nevertheless, the noise-masked stimuli were more intelligible in the audio-visual condition than in the audio-only condition due to the contribution of visual information to the extraction of acoustic cues. Replacing the lip gesture by a non-speech visual input with exactly the same time course, providing the same temporal cues for extraction, removed the intelligibility benefit. This early contribution to audio-visual speech identification is discussed in relation to recent neurophysiological data on audio-visual perception.