Department of Psychology, School of Social Sciences, Tsinghua University, Beijing 100084, China; Tsinghua Laboratory of Brain and Intelligence, Tsinghua University, Beijing 100084, China.
Department of Education and Psychology, Freie Universität Berlin, Berlin 14195, Federal Republic of Germany.
Neuroimage. 2023 Nov 15;282:120404. doi: 10.1016/j.neuroimage.2023.120404. Epub 2023 Oct 6.
Despite the distortion of speech signals caused by unavoidable noise in daily life, our ability to comprehend speech in noisy environments is relatively stable. However, the neural mechanisms underlying reliable speech-in-noise comprehension remain to be elucidated. The present study investigated the neural tracking of acoustic and semantic speech information during noisy naturalistic speech comprehension. Participants listened to narrative audio recordings mixed with spectrally matched stationary noise at three signal-to-noise ratio (SNR) levels (no noise, 3 dB, -3 dB) while 60-channel electroencephalography (EEG) signals were recorded. A temporal response function (TRF) method was employed to derive event-related-like responses to the continuous speech stream at both the acoustic and the semantic levels. The amplitude envelope of the naturalistic speech was taken as the acoustic feature, whereas word entropy and word surprisal were extracted with natural language processing methods as two semantic features. Theta-band frontocentral TRF responses to the acoustic feature were observed at around 400 ms after speech fluctuation onset at all three SNR levels, and the response latencies increased with increasing noise. Delta-band frontal TRF responses to the semantic feature of word entropy were observed at around 200 to 600 ms before speech fluctuation onset at all three SNR levels. These responses led speech fluctuation onset by a greater margin with increasing noise and with decreasing speech comprehension and intelligibility. While the responses that followed speech acoustics were consistent with previous studies, our study revealed the robustness of the responses that led speech semantics, which suggests a possible predictive mechanism at the semantic level for maintaining reliable speech comprehension in noisy environments.
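To make the TRF idea concrete, the following is a minimal sketch of how a response kernel can be estimated from a continuous stimulus feature and an EEG signal with time-lagged ridge regression. It is an illustration under stated assumptions, not the study's pipeline: the abstract does not specify the estimator, toolbox, lag window, or regularization used, so the function names (`lagged_design`, `estimate_trf`), the `ridge` parameter, and the lag range are all hypothetical. Negative lags are included so that a response preceding the stimulus feature, as reported for word entropy, can appear in the estimated kernel.

```python
import numpy as np


def lagged_design(stimulus, lags):
    """Build a design matrix whose column j holds stimulus[t - lags[j]].

    Positive lags model responses that follow the stimulus; negative lags
    model responses that lead it. Samples shifted past the edges are zero.
    (Hypothetical helper, not taken from the study.)
    """
    stimulus = np.asarray(stimulus, dtype=float)
    n = stimulus.shape[0]
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stimulus[:n - lag]
        else:
            X[:n + lag, j] = stimulus[-lag:]
    return X


def estimate_trf(stimulus, eeg, lags, ridge=1.0):
    """Estimate TRF weights by ridge regression (assumed estimator).

    stimulus : 1-D array, e.g. an amplitude envelope or a word-level
               feature placed as impulses at word onsets.
    eeg      : array of shape (n_samples,) or (n_samples, n_channels).
    lags     : integer lags in samples.
    Returns an array of shape (n_lags,) or (n_lags, n_channels).
    """
    X = lagged_design(stimulus, lags)
    # Solve (X'X + ridge * I) w = X'y for the kernel weights.
    gram = X.T @ X + ridge * np.eye(len(lags))
    return np.linalg.solve(gram, X.T @ np.asarray(eeg, dtype=float))
```

With a sampling rate of `fs` Hz (an assumed parameter), a lag of `k` samples corresponds to `1000 * k / fs` ms, so the latency of a TRF peak can be read off as the lag at which the estimated weights are largest in magnitude.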