O'Hanlon Brandon, Plack Christopher J, Nuttall Helen E
Department of Psychology, Lancaster University, United Kingdom.
Manchester Centre for Audiology and Deafness, The University of Manchester, United Kingdom.
J Speech Lang Hear Res. 2025 Jan 2;68(1):26-39. doi: 10.1044/2024_JSLHR-24-00162. Epub 2024 Dec 2.
In difficult listening conditions, the visual system assists with speech perception through lipreading. Stimulus onset asynchrony (SOA) is used to investigate the interaction between the two modalities in speech perception. Previous estimates of audiovisual benefit and of the SOA integration period differ widely. A limitation of previous research is a lack of consideration of visemes (categories of phonemes defined by similar lip movements when produced by a speaker) to ensure that selected phonemes are visually distinct. This study aimed to reassess the benefit of audiovisual lipreading to speech perception when stimuli are selected from different viseme categories and presented in noise. The study also aimed to investigate the effects of SOA on these stimuli.
Sixty participants were tested online and presented with audio-only stimuli and audiovisual stimuli that included video of the speaker's lip movements. The speech was presented either with or without noise at six different SOAs (0, 200, 216.6, 233.3, 250, and 266.6 ms). Participants discriminated between speech syllables with button presses.
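The nonzero SOA values are consistent with whole-frame audio delays of a 60 fps video stream, where one frame lasts 1000/60 ≈ 16.67 ms. The following Python sketch is an illustrative assumption rather than a procedure described by the authors: it shows how frame-locked delays of 12 to 16 frames reproduce the reported SOA conditions (the abstract appears to truncate the repeating decimals, e.g., 216.67 to 216.6).

# Illustrative sketch (assumption): derive the study's nonzero SOAs as
# whole-frame audio delays of a 60 fps video. Not the authors' code.
FPS = 60
FRAME_MS = 1000 / FPS  # one video frame, about 16.67 ms

# Delays of 12-16 frames reproduce the reported SOA conditions.
for n_frames in range(12, 17):
    soa_ms = n_frames * FRAME_MS
    print(f"{n_frames:2d} frames -> {soa_ms:6.2f} ms")
# 12 frames -> 200.00 ms
# 13 frames -> 216.67 ms
# 14 frames -> 233.33 ms
# 15 frames -> 250.00 ms
# 16 frames -> 266.67 ms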
The benefit of visual information was weaker than in previous studies. Reaction times increased significantly once an SOA was introduced, but SOA had no significant effect on accuracy. Furthermore, exploratory analyses suggest that the effect was not equal across viseme categories: "ba" was more difficult to recognize than "ka" in noise.
In summary, the findings suggest that the contribution of audiovisual integration to speech processing is weaker when viseme categories are taken into account, and that the present results are not sufficient to identify a full integration period.