
Eye Gaze and Perceptual Adaptation to Audiovisual Degraded Speech.

Affiliations

Division of Neuroscience and Experimental Psychology, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom.

Manchester Centre for Audiology and Deafness, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom.

Publication Information

J Speech Lang Hear Res. 2021 Sep 14;64(9):3432-3445. doi: 10.1044/2021_JSLHR-21-00106. Epub 2021 Aug 31.

Abstract

Purpose

Visual cues from a speaker's face may benefit perceptual adaptation to degraded speech, but current evidence is limited. We aimed to replicate results from previous studies to establish the extent to which visual speech cues can lead to greater adaptation over time, extending existing results to a real-time adaptation paradigm (i.e., without a separate training period). A second aim was to investigate whether eye gaze patterns toward the speaker's mouth were related to better perception, hypothesizing that listeners who looked more at the speaker's mouth would show greater adaptation.

Method

A group of listeners (n = 30) was presented with 90 noise-vocoded sentences in audiovisual format, whereas a control group (n = 29) was presented with the audio signal only. Recognition accuracy was measured throughout, and eye tracking was used to measure fixations toward the speaker's eyes and mouth in the audiovisual group.

Results

Previous studies were partially replicated: The audiovisual group had better recognition throughout and adapted slightly more rapidly, but both groups showed an equal amount of improvement overall. Longer fixations on the speaker's mouth in the audiovisual group were related to better overall accuracy. An exploratory analysis further demonstrated that the duration of fixations to the speaker's mouth decreased over time.

Conclusions

The results suggest that visual cues may not benefit adaptation to degraded speech as much as previously thought. Longer fixations on a speaker's mouth may play a role in successfully decoding visual speech cues; however, this will need to be confirmed in future research to fully understand how patterns of eye gaze are related to audiovisual speech recognition. All materials, data, and code are available at https://osf.io/2wqkf/.
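The Method above refers to noise-vocoded sentences: speech degraded by discarding spectral fine structure while keeping the slow amplitude envelope in a small number of frequency bands, so that intelligibility must be learned over exposure. As a minimal sketch only — the abstract does not state the authors' channel count, filter design, or envelope cutoff, so every parameter below is an assumption — a basic noise vocoder might look like this in Python:

```python
# Minimal noise-vocoder sketch (illustrative; channel count, band edges,
# filter order, and envelope cutoff are assumptions, not the paper's values).
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(signal, fs, n_channels=6, f_lo=100.0, f_hi=7000.0,
                 env_cutoff=30.0):
    """Replace each band's fine structure with noise, keeping only the
    slow amplitude envelope per channel (assumes fs comfortably above
    2 * f_hi, e.g., 22.05 kHz or higher)."""
    signal = np.asarray(signal, dtype=float)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log-spaced bands
    carrier = np.random.default_rng(0).standard_normal(len(signal))
    env_sos = butter(4, env_cutoff, btype="lowpass", fs=fs, output="sos")
    out = np.zeros_like(signal)
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(band_sos, signal)
        # Slow amplitude envelope of this band.
        envelope = sosfiltfilt(env_sos, np.abs(hilbert(band)))
        noise_band = sosfiltfilt(band_sos, carrier)   # band-limited noise
        out += np.clip(envelope, 0.0, None) * noise_band
    # Match overall RMS to the input so loudness is roughly preserved.
    return out * np.sqrt(np.mean(signal**2) / np.mean(out**2))

# Tiny demo on a synthetic stand-in for a speech waveform.
fs = 22050
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t) * np.exp(-3 * t)
y = noise_vocode(x, fs)
```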

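The eye-tracking measure reported above (fixation time on the speaker's eyes vs. mouth) amounts to summing fixation durations that fall inside predefined areas of interest (AOIs). The sketch below is illustrative only, not the authors' pipeline; the AOI rectangles, coordinate units, and data layout are hypothetical:

```python
# Illustrative AOI fixation analysis (AOI rectangles, coordinates, and
# field names are hypothetical; the paper's actual definitions are not
# given in the abstract).
from dataclasses import dataclass

@dataclass
class Fixation:
    x: float         # horizontal gaze position (pixels)
    y: float         # vertical gaze position (pixels)
    duration: float  # fixation duration (ms)

# Hypothetical AOIs as (x_min, y_min, x_max, y_max) screen rectangles.
AOIS = {
    "eyes":  (400.0, 200.0, 600.0, 300.0),
    "mouth": (430.0, 380.0, 570.0, 460.0),
}

def aoi_durations(fixations):
    """Total fixation time (ms) inside each AOI for one trial."""
    totals = {name: 0.0 for name in AOIS}
    for f in fixations:
        for name, (x0, y0, x1, y1) in AOIS.items():
            if x0 <= f.x <= x1 and y0 <= f.y <= y1:
                totals[name] += f.duration
    return totals

# Example: one trial with three fixations (the last one off-face).
trial = [Fixation(500, 250, 310), Fixation(505, 420, 450),
         Fixation(100, 100, 120)]
print(aoi_durations(trial))  # {'eyes': 310.0, 'mouth': 450.0}
```

Relating such per-trial mouth durations to recognition accuracy, and tracking how they change across the 90 sentences, corresponds to the analyses summarized in the Results.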
