Robust neural tracking of linguistic speech representations using a convolutional neural network.

Affiliations

Department of Neurosciences, ExpORL, KU Leuven, Leuven, Belgium.

Department of Electrical Engineering (ESAT), PSI, KU Leuven, Leuven, Belgium.

Publication information

J Neural Eng. 2023 Aug 30;20(4). doi: 10.1088/1741-2552/acf1ce.

Abstract

When listening to continuous speech, populations of neurons in the brain track different features of the signal. Neural tracking can be measured by relating the electroencephalography (EEG) signal to the speech signal. Recent studies using linear models have shown that linguistic features contribute significantly to neural tracking over and above acoustic features. However, linear models cannot model the nonlinear dynamics of the brain. To overcome this, we use a convolutional neural network (CNN) that relates EEG to linguistic features, with phoneme or word onsets as a control, and that can model nonlinear relations.

We integrate phoneme- and word-based linguistic features (phoneme surprisal, cohort entropy (CE), word surprisal (WS) and word frequency (WF)) in our nonlinear CNN model and investigate whether they carry additional information on top of lexical features (phoneme and word onsets). We then compare the performance of our nonlinear CNN with that of a linear encoder and a linearized CNN.

For the nonlinear CNN, we found a significant contribution of CE over phoneme onsets and of WS and WF over word onsets. Moreover, the nonlinear CNN outperformed the linear baselines.

Measuring the coding of linguistic features in the brain is important for auditory neuroscience research and for applications that involve objectively measuring speech understanding. With linear models, this is measurable, but the effects are very small. The proposed nonlinear CNN model yields larger differences between linguistic and lexical models and could therefore show effects that would otherwise be unmeasurable. In the future, it may lead to improved within-subject measures and shorter recordings.
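
The distinction between a lexical control feature (word onsets) and a linguistic feature (word surprisal) can be illustrated with a minimal sketch. It assumes, as is common in neural tracking work but not stated in this abstract, that each feature is a time series of impulses at word onsets, with the control using unit impulses and the linguistic feature scaling each impulse by its value. The function name `impulse_feature`, the 64 Hz sampling rate, and the onset times and probabilities are all hypothetical and chosen only for illustration.

```python
import numpy as np

def impulse_feature(onsets_s, values, duration_s, fs=64):
    """Return a zero time series with `values` placed at the onset samples.

    fs is an assumed sampling rate in Hz, not one taken from the paper.
    """
    n = int(round(duration_s * fs))
    x = np.zeros(n)
    idx = np.round(np.asarray(onsets_s) * fs).astype(int)
    keep = idx < n  # ignore onsets that fall beyond the end of the signal
    x[idx[keep]] = np.asarray(values, dtype=float)[keep]
    return x

# Hypothetical word onsets (seconds) and probabilities P(word | preceding words)
word_onsets = [0.25, 0.75, 1.5]
word_probs = [0.5, 0.125, 0.25]

# Word surprisal as the negative log probability of the word given its context (in bits)
word_surprisal = -np.log2(word_probs)

# Control feature: unit impulses. Linguistic feature: impulses scaled by surprisal.
onset_feature = impulse_feature(word_onsets, np.ones(len(word_onsets)), duration_s=2.0)
surprisal_feature = impulse_feature(word_onsets, word_surprisal, duration_s=2.0)
```

Under these assumptions, the two features are nonzero at exactly the same samples and differ only in amplitude, which is why any gain of a model using the surprisal feature over one using only onsets can be attributed to the linguistic information rather than to word timing.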
