Machine learning for decoding listeners' attention from electroencephalography evoked by continuous speech.

Author information

de Taillez Tobias, Kollmeier Birger, Meyer Bernd T

Affiliation

Medizinische Physik and Cluster of Excellence Hearing4all, Carl von Ossietzky Universität, Oldenburg, 26129, Germany.

Publication information

Eur J Neurosci. 2020 Mar;51(5):1234-1241. doi: 10.1111/ejn.13790. Epub 2018 Jan 4.

Abstract

Previous research has shown that it is possible to predict which speaker is attended in a multispeaker scene by analyzing a listener's electroencephalography (EEG) activity. In this study, existing linear models that learn the mapping from neural activity to an attended speech envelope are replaced by a non-linear neural network (NN). The proposed architecture takes into account the temporal context of the estimated envelope and is evaluated using EEG data obtained from 20 normal-hearing listeners who focused on one speaker in a two-speaker setting. The network is optimized with respect to the frequency range and the temporal segmentation of the EEG input, as well as the cost function used to estimate the model parameters. To identify the salient cues involved in auditory attention, a relevance algorithm is applied that highlights the electrode signals most important for attention decoding. In contrast to linear approaches, the NN profits from a wider EEG frequency range (1-32 Hz) and achieves a performance seven times higher than the linear baseline. Relevant EEG activations following the speech stimulus after 170 ms at physiologically plausible locations were found. This was not observed when the model was trained on the unattended speaker. Our findings therefore indicate that non-linear NNs can provide insight into physiological processes by analyzing EEG activity.
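The decoding idea summarized above (reconstruct a speech envelope from EEG, then pick the speaker whose envelope matches it best) can be illustrated with a minimal sketch. This is not the authors' neural network: it is a simple ridge-regression decoder of the kind the abstract calls the linear baseline, run on synthetic data. All function names, the lag count, the ridge value, and the simulated signals are assumptions made for illustration only.

```python
import numpy as np


def lagged(eeg, n_lags):
    """Stack EEG samples t .. t+n_lags-1 per time step (EEG lags the stimulus)."""
    n_samples, n_channels = eeg.shape
    padded = np.vstack([eeg, np.zeros((n_lags - 1, n_channels))])
    return np.hstack([padded[k:k + n_samples] for k in range(n_lags)])


def fit_decoder(eeg, envelope, n_lags=8, ridge=1.0):
    """Ridge regression from lagged EEG to the attended speech envelope."""
    X = lagged(eeg, n_lags)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ envelope)


def decode_attention(eeg, env_a, env_b, weights, n_lags=8):
    """Reconstruct an envelope and choose the better-correlated speaker."""
    recon = lagged(eeg, n_lags) @ weights
    r_a = np.corrcoef(recon, env_a)[0, 1]
    r_b = np.corrcoef(recon, env_b)[0, 1]
    return ("A" if r_a > r_b else "B"), r_a, r_b


def simulate(rng, n_samples, mixing, delay=3):
    """Toy data (assumption): EEG is a delayed, noisy copy of speaker A only."""
    env_a = rng.standard_normal(n_samples)
    env_b = rng.standard_normal(n_samples)
    delayed = np.concatenate([np.zeros(delay), env_a[:-delay]])
    noise = rng.standard_normal((n_samples, mixing.size))
    return np.outer(delayed, mixing) + noise, env_a, env_b


rng = np.random.default_rng(0)
mixing = rng.standard_normal(8)  # hypothetical per-electrode gains

# Train on one simulated recording, evaluate on a second one.
train_eeg, train_env_a, _ = simulate(rng, 2000, mixing)
weights = fit_decoder(train_eeg, train_env_a)

test_eeg, test_env_a, test_env_b = simulate(rng, 2000, mixing)
choice, r_a, r_b = decode_attention(test_eeg, test_env_a, test_env_b, weights)
print(choice, round(r_a, 2), round(r_b, 2))
```

In the study itself the linear mapping is replaced by a non-linear network; in this sketch that would mean swapping `fit_decoder` for a trained model while keeping the correlation-based decision in `decode_attention`.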
