ADT Network: A Novel Nonlinear Method for Decoding Speech Envelopes From EEG Signals.

Affiliations

School of Intelligent Medicine, China Medical University, Shenyang, China.

Shengjing Hospital of China Medical University, Shenyang, China.

Publication information

Trends Hear. 2024 Jan-Dec;28:23312165241282872. doi: 10.1177/23312165241282872.

Abstract

Decoding speech envelopes from electroencephalogram (EEG) signals holds potential as a research tool for objectively assessing auditory processing, which could contribute to future developments in hearing loss diagnosis. However, current methods struggle to achieve both high accuracy and interpretability. We propose a deep learning model called the auditory decoding transformer (ADT) network for speech envelope reconstruction from EEG signals to address these issues. The ADT network uses spatio-temporal convolution for feature extraction, followed by a transformer decoder to decode the speech envelopes. Through anticausal masking, the ADT considers only the current and future EEG features to match the natural relationship of speech and EEG. Performance evaluation shows that the ADT network achieves average reconstruction scores of 0.168 and 0.167 on the SparrKULee and DTU datasets, respectively, rivaling those of other nonlinear models. Furthermore, by visualizing the weights of the spatio-temporal convolution layer as time-domain filters and brain topographies, combined with an ablation study of the temporal convolution kernels, we analyze the behavioral patterns of the ADT network in decoding speech envelopes. The results indicate that low- (0.5-8 Hz) and high-frequency (14-32 Hz) EEG signals are more critical for envelope reconstruction and that the active brain regions are primarily distributed bilaterally in the auditory cortex, consistent with previous research. Visualization of attention scores further validated previous research. In summary, the ADT network balances high performance and interpretability, making it a promising tool for studying neural speech envelope tracking.
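The anticausal masking mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no code-level details, so the function names `anticausal_mask` and `masked_softmax`, the sequence length, and the uniform toy scores are all assumptions made for illustration. The idea shown is only that each output time step may attend to EEG features at the same or later time steps, since the neural response follows the speech that evokes it.

```python
import math

def anticausal_mask(n):
    """mask[i][j] is True when output step i may attend to EEG step j (j >= i)."""
    return [[j >= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over allowed positions; masked positions get weight 0."""
    weights = []
    for row, allowed in zip(scores, mask):
        # The diagonal is always allowed, so every row has at least one valid entry.
        peak = max(s for s, ok in zip(row, allowed) if ok)
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical toy input: uniform attention scores over 4 time steps (not real data).
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, anticausal_mask(4))
```

With uniform scores, the first row spreads its weight evenly over all four steps, while the last row places all of its weight on the final step, because no future EEG features remain for it to attend to.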

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/f6c8c3f42180/10.1177_23312165241282872-fig1.jpg
