Zhejiang Lab, Hangzhou 311121, China.
Key Laboratory for Biomedical Engineering of Ministry of Education, College of Biomedical Engineering and Instrument Sciences, Zhejiang University, Hangzhou 310027, China; The State Key Lab of Brain-Machine Intelligence; The MOE Frontier Science Center for Brain Science & Brain-machine Integration, Zhejiang University, Hangzhou 310027, China.
Neuroimage. 2024 Oct 15;300:120875. doi: 10.1016/j.neuroimage.2024.120875. Epub 2024 Sep 27.
In speech perception, low-frequency cortical activity tracks hierarchical linguistic units (e.g., syllables, phrases, and sentences) in addition to acoustic features (e.g., the speech envelope). Because fluctuations of the speech envelope typically correspond to syllabic boundaries, one common interpretation is that the acoustic envelope underlies the extraction of discrete syllables from continuous speech for subsequent linguistic processing. However, it remains unclear whether and how cortical activity encodes linguistic information when the speech envelope does not provide acoustic correlates of syllables. To address this issue, we introduced a frequency-tagging speech stream in which the syllabic rhythm was obscured by echoic envelopes, and investigated neural encoding of hierarchical linguistic information using electroencephalography (EEG). When listeners attended to the echoic speech, cortical activity reliably tracked the syllable, phrase, and sentence levels, with higher-level linguistic units eliciting more robust neural responses. When attention was diverted from the echoic speech, reliable neural tracking of the syllable level was still observed, whereas neural tracking of the phrase and sentence levels deteriorated. Further analyses revealed that an envelope aligned with the syllabic rhythm could be recovered from the echoic speech through a neural adaptation model, and that the reconstructed envelope had higher predictive power for the neural tracking responses than either the original echoic envelope or the anechoic envelope. Taken together, these results suggest that neural adaptation and attentional modulation jointly contribute to the neural encoding of linguistic information in distorted speech where the syllabic rhythm is obscured by echoes.
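As a rough illustration of how an echo can obscure the syllabic rhythm in an envelope, the sketch below adds a delayed copy of a toy envelope to itself and compares the spectral amplitude at the syllable rate before and after. Everything here is an assumption for illustration and is not taken from the study: the 4 Hz syllable rate, the 200 Hz sampling rate, the sinusoidal envelope, the choice of an echo delay equal to half the syllable period, and the fact that the echo is applied directly to the envelope rather than to the speech waveform. The neural adaptation model mentioned above is not implemented.

```python
import numpy as np


def amplitude_at(x, fs, freq):
    """Return the normalized DFT amplitude of x at the bin closest to freq (Hz)."""
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]


# Hypothetical parameters, chosen only to make the arithmetic clean.
fs = 200                 # envelope sampling rate (Hz)
syllable_rate = 4.0      # assumed syllable rate (Hz)
duration = 10.0          # seconds, an integer number of syllable cycles
t = np.arange(int(fs * duration)) / fs

# Toy anechoic envelope: a raised cosine fluctuating at the syllable rate.
anechoic = 0.5 * (1.0 + np.cos(2.0 * np.pi * syllable_rate * t))

# Echo delayed by half a syllable period (0.125 s = 25 samples here).
# np.roll gives a circular delay, which keeps the toy signal periodic.
delay_samples = int(round(fs / (2.0 * syllable_rate)))
echoic = anechoic + np.roll(anechoic, delay_samples)

anechoic_amp = amplitude_at(anechoic, fs, syllable_rate)
echoic_amp = amplitude_at(echoic, fs, syllable_rate)

print(f"Amplitude at {syllable_rate} Hz, anechoic envelope: {anechoic_amp:.3f}")
print(f"Amplitude at {syllable_rate} Hz, echoic envelope:   {echoic_amp:.3e}")
```

In this toy case the delayed copy is in antiphase with the original at the syllable rate, so the two fluctuations cancel and the summed envelope is essentially flat. Real echoic speech is more complex, but the same comb-filter effect is one way an envelope can lose its acoustic correlate of syllables.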