Department of Psychology, Carleton College, Northfield, MN, USA.
Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO, USA.
Psychon Bull Rev. 2019 Feb;26(1):291-297. doi: 10.3758/s13423-018-1489-7.
Speech recognition is improved when the acoustic input is accompanied by visual cues provided by a talking face (Erber in Journal of Speech and Hearing Research, 12(2), 423-425, 1969; Sumby & Pollack in The Journal of the Acoustical Society of America, 26(2), 212-215, 1954). One way that the visual signal facilitates speech recognition is by providing the listener with information about fine phonetic detail that complements information from the auditory signal. However, given that degraded face stimuli can still improve speech recognition accuracy (Munhall et al. in Perception & Psychophysics, 66(4), 574-583, 2004), and static or moving shapes can improve speech detection accuracy (Bernstein et al. in Speech Communication, 44(1/4), 5-18, 2004), aspects of the visual signal other than fine phonetic detail may also contribute to the perception of speech. In two experiments, we show that a modulating circle providing information about the onset, offset, and acoustic amplitude envelope of the speech does not improve recognition of spoken sentences (Experiment 1) or words (Experiment 2), but does reduce the effort necessary to recognize speech. These results suggest that although fine phonetic detail may be required for the visual signal to benefit speech recognition, low-level features of the visual signal may function to reduce the cognitive effort associated with processing speech.