Alsius Agnès, Wayne Rachel V, Paré Martin, Munhall Kevin G
Department of Psychology, Queen's University, Humphrey Hall, 62 Arch St, Kingston, Ontario, Canada, K7L 3N6.
Centre for Neuroscience Studies, Queen's University, Kingston, Ontario, Canada.
Atten Percept Psychophys. 2016 Jul;78(5):1472-87. doi: 10.3758/s13414-016-1109-4.
The basis for individual differences in the degree to which visual speech input enhances comprehension of acoustically degraded speech is largely unknown. Previous research indicates that fine facial detail is not critical for visual enhancement when auditory information is available; however, these studies did not examine individual differences in the ability to make use of fine facial detail in relation to audiovisual speech perception ability. Here, we compare participants based on their ability to benefit from visual speech information in the presence of an auditory signal degraded with noise, modulating the resolution of the visual signal through low-pass spatial frequency filtering and monitoring gaze behavior. Participants who benefited most from the addition of visual information (high visual gain) were more adversely affected by the removal of high spatial frequency information, compared to participants with low visual gain, for materials with both poor and rich contextual cues (i.e., words and sentences, respectively). Differences as a function of gaze behavior between participants with the highest and lowest visual gains were observed only for words, with participants with the highest visual gain fixating longer on the mouth region. Our results indicate that individual variance in audiovisual speech-in-noise performance can be accounted for, in part, by better use of fine facial detail information extracted from the visual signal and increased fixation on mouth regions for short stimuli. Thus, for some, audiovisual speech perception may suffer when the visual input (in addition to the auditory signal) is less than perfect.