Zhang Changxin, Arnott Stephen R, Rabaglia Cristina, Avivi-Reich Meital, Qi James, Wu Xihong, Li Liang, Schneider Bruce A
Department of Psychology, Speech and Hearing Research Center, McGovern Institute for Brain Research at PKU, Key Laboratory on Machine Perception (Ministry of Education), Peking University, Beijing, China.
Rotman Research Institute, Baycrest Centre, Toronto, Ontario, Canada.
Hear Res. 2016 Jan;331:119-30. doi: 10.1016/j.heares.2015.11.002. Epub 2015 Nov 10.
To recognize speech in a noisy auditory scene, listeners need to perceptually segregate the target talker's voice from other competing sounds (stream segregation). A number of studies have suggested that the attentional demands placed on listeners increase as the acoustic properties and informational content of the competing sounds become more similar to those of the target voice. Hence, we would expect attentional demands to be considerably greater when speech is masked by speech than when it is masked by steady-state noise. To investigate the role of attentional mechanisms in the unmasking of speech sounds, event-related potentials (ERPs) were recorded to a syllable masked by noise or competing speech under both active (the participant was asked to respond when the syllable was presented) and passive (no response was required) listening conditions. The results showed that, when the masker was a steady-state noise, the long-latency auditory response to a syllable (/bi/), presented at different signal-to-masker ratios (SMRs), was similar in the passive and active listening conditions. In contrast, when the masker was two-talker speech, a switch from the passive listening condition to the active one significantly enhanced the ERPs to the syllable. These results support the hypothesis that the need to engage attentional mechanisms in aid of scene analysis increases as the similarity (both acoustic and informational) between the target speech and the competing background sounds increases.