Abdel-Latif Khaled H A, Koelewijn Thomas, Başkent Deniz, Meister Hartmut
Faculty of Medicine and University Hospital Cologne, Department of Otorhinolaryngology, Head and Neck Surgery, University of Cologne, Cologne, Germany.
Jean Uhrmacher Institute for Clinical ENT-Research, University of Cologne, Cologne, Germany.
Trends Hear. 2025 Jan-Dec;29:23312165241306091. doi: 10.1177/23312165241306091.
Speech-on-speech masking is a common and challenging situation in everyday verbal communication. The ability to segregate competing auditory streams is a necessary requirement for focusing attention on the target speech. The Visual World Paradigm (VWP) provides insight into speech processing by capturing gaze fixations on visually presented icons that reflect the speech signal. This study aimed to propose a new VWP to examine the time course of speech segregation when competing sentences are presented and to collect pupil size data as a measure of listening effort. Twelve young normal-hearing participants were presented with competing matrix sentences (structure "name-verb-numeral-adjective-object") diotically via headphones at four target-to-masker ratios (TMRs), corresponding to intermediate to near-perfect speech recognition. The VWP visually presented the number and object words from both the target and masker sentences. Participants were instructed to gaze at the corresponding words of the target sentence without providing verbal responses. The gaze fixations consistently reflected the different TMRs for both number and object words. The slopes of the fixation curves were steeper and the proportion of target fixations higher at more favorable TMRs, suggesting more efficient segregation under more favorable conditions. Temporal analysis of the pupil data using Bayesian paired-sample t-tests showed a corresponding reduction in pupil dilation with increasing TMR, indicating reduced listening effort. The results support the conclusion that the proposed VWP and the captured eye movements and pupil dilation are suitable for an objective assessment of sentence-based speech-on-speech segregation and the corresponding listening effort.
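The sketch below is not the authors' analysis code; it only illustrates, under stated assumptions, two steps the abstract describes: computing the proportion of target-word fixations over time for each TMR (with a crude slope estimate per condition) and running a Bayesian paired-sample t-test on mean pupil dilation between two TMR conditions. The data are simulated placeholders, the column names are invented, and pingouin is used merely as one readily available tool that reports a Bayes factor; the paper does not specify which software was used.

```python
# Minimal sketch (assumed data layout, simulated values, hypothetical column names).
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
n_bins = 50  # 50 ms bins covering 0-2450 ms (assumed analysis window)

# Assumed long-format gaze data: one row per participant, TMR, and time bin,
# with a flag marking whether gaze fell on the target word's icon.
gaze = pd.DataFrame({
    "participant": np.repeat(np.arange(12), 4 * n_bins),
    "tmr_db": np.tile(np.repeat([-8, -4, 0, 4], n_bins), 12),
    "time_ms": np.tile(np.arange(n_bins) * 50, 12 * 4),
})
# Simulated fixation probability: target fixations become more likely over time,
# and more so at higher TMRs (placeholder for real eye-tracking data).
p = (1 / (1 + np.exp(-(gaze["time_ms"] - 1200) / 300))) * (0.5 + 0.05 * (gaze["tmr_db"] + 8) / 4)
gaze["on_target"] = rng.random(len(gaze)) < p

# (1) Proportion of target fixations per time bin and TMR, averaged over trials/participants.
fix_curves = (
    gaze.groupby(["tmr_db", "time_ms"])["on_target"]
        .mean()
        .unstack("tmr_db")
)
# Crude slope estimate per TMR over a fixed window (1000-2000 ms), mirroring the
# "steeper fixation curves at more favorable TMRs" pattern reported in the abstract.
window = fix_curves.loc[1000:2000]
slopes = {tmr: np.polyfit(window.index, window[tmr], 1)[0] for tmr in window.columns}
print("fixation-curve slopes per TMR (prop./ms):", slopes)

# (2) Bayesian paired-sample t-test on mean pupil dilation for two TMR conditions.
# pingouin's ttest() reports a Bayes factor (BF10) alongside the frequentist statistics.
pupil_low_tmr = rng.normal(0.25, 0.05, 12)   # placeholder mean dilations, n = 12
pupil_high_tmr = rng.normal(0.20, 0.05, 12)
result = pg.ttest(pupil_low_tmr, pupil_high_tmr, paired=True)
print(result[["T", "p-val", "BF10"]])
```

In practice such comparisons would be run per time bin or window across the sentence, which is presumably how the temporal analysis of the pupil data was carried out; the exact windowing and software are not specified in the abstract.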