Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville.
J Speech Lang Hear Res. 2022 Jan 12;65(1):253-273. doi: 10.1044/2021_JSLHR-21-00177. Epub 2021 Nov 17.
It is well recognized that adding a visual signal to the acoustic speech signal improves recognition when the acoustic signal is degraded, but how that visual signal affects postrecognition processes is less well understood. This study was designed to further elucidate the relationships between auditory and visual codes in working memory, a postrecognition process.
In a main experiment, 80 young adults with normal hearing were tested using an immediate serial recall paradigm. Three types of signals were presented (unprocessed speech, vocoded speech, and environmental sounds) in three conditions (audio-only, audio-video with dynamic visual signals, and audio-picture with static visual signals). Three dependent measures were analyzed: (a) magnitude of the recency effect, (b) overall recall accuracy, and (c) response times, as an index of cognitive effort. In a follow-up experiment, 30 young adults with normal hearing were tested using largely the same procedures, but with a slight change in the order of stimulus presentation.
The main experiment produced three major findings: (a) Unprocessed speech evoked a recency effect of consistent magnitude across conditions; vocoded speech evoked a recency effect of magnitude similar to that of unprocessed speech only with dynamic visual (lipread) signals; environmental sounds never evoked a recency effect. (b) Dynamic and static visual signals enhanced overall recall accuracy to a similar extent, and this enhancement was greater for vocoded speech and environmental sounds than for unprocessed speech. (c) All visual signals reduced cognitive load, except for dynamic visual signals paired with environmental sounds. The follow-up experiment revealed that dynamic visual (lipread) signals exerted their effect on the vocoded stimuli by enhancing phonological quality.
Acoustic and visual signals can combine to enhance working memory operations, but the source of these effects differs for phonological and nonphonological signals. Nonetheless, visual information can support better postrecognition processing for patients with hearing loss.