Suppr 超能文献



ADT Network: A Novel Nonlinear Method for Decoding Speech Envelopes From EEG Signals.

Affiliations

School of Intelligent Medicine, China Medical University, Shenyang, China.

Shengjing Hospital of China Medical University, Shenyang, China.

Publication Information

Trends Hear. 2024 Jan-Dec;28:23312165241282872. doi: 10.1177/23312165241282872.

DOI: 10.1177/23312165241282872
PMID: 39397786
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11489951/
Abstract

Decoding speech envelopes from electroencephalogram (EEG) signals holds potential as a research tool for objectively assessing auditory processing, which could contribute to future developments in hearing loss diagnosis. However, current methods struggle to meet both high accuracy and interpretability. We propose a deep learning model called the auditory decoding transformer (ADT) network for speech envelope reconstruction from EEG signals to address these issues. The ADT network uses spatio-temporal convolution for feature extraction, followed by a transformer decoder to decode the speech envelopes. Through anticausal masking, the ADT considers only the current and future EEG features to match the natural relationship of speech and EEG. Performance evaluation shows that the ADT network achieves average reconstruction scores of 0.168 and 0.167 on the SparrKULee and DTU datasets, respectively, rivaling those of other nonlinear models. Furthermore, by visualizing the weights of the spatio-temporal convolution layer as time-domain filters and brain topographies, combined with an ablation study of the temporal convolution kernels, we analyze the behavioral patterns of the ADT network in decoding speech envelopes. The results indicate that low- (0.5-8 Hz) and high-frequency (14-32 Hz) EEG signals are more critical for envelope reconstruction and that the active brain regions are primarily distributed bilaterally in the auditory cortex, consistent with previous research. Visualization of attention scores further validated previous research. In summary, the ADT network balances high performance and interpretability, making it a promising tool for studying neural speech envelope tracking.
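The abstract's key architectural detail is the anticausal mask: unlike the causal mask used in language models, each decoding step may attend only to the current and future EEG features, matching the direction in which EEG follows speech. A minimal plain-Python sketch of that mask construction is shown below; this is an illustration of the masking idea only, not the ADT network's actual implementation, and the function name is hypothetical.

```python
def anticausal_mask(seq_len):
    """Build an anticausal attention mask as a boolean matrix.

    mask[i][j] is True when key position j is BLOCKED for query
    position i. An anticausal mask blocks the past (j < i), so each
    step attends only to the current and future positions -- the
    mirror image of the usual causal (past-only) mask.
    """
    return [[j < i for j in range(seq_len)] for i in range(seq_len)]

# Position 0 can attend to every position; position 3 only to itself.
m = anticausal_mask(4)
```

In a transformer implementation this boolean matrix would typically be passed to the attention layer so that masked (past) positions receive -inf scores before the softmax.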

Figures (from PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/f6c8c3f42180/10.1177_23312165241282872-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/5e9776b96a7f/10.1177_23312165241282872-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/1f1a4c4f27a5/10.1177_23312165241282872-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/1c3cc06f57f8/10.1177_23312165241282872-fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/fe450272a30a/10.1177_23312165241282872-fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/92b17a2c770a/10.1177_23312165241282872-fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/040e8bab0145/10.1177_23312165241282872-fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5b60/11489951/fd141acdc191/10.1177_23312165241282872-fig8.jpg

Similar Articles

1. ADT Network: A Novel Nonlinear Method for Decoding Speech Envelopes From EEG Signals.
   Trends Hear. 2024 Jan-Dec;28:23312165241282872. doi: 10.1177/23312165241282872.
2. AADNet: An End-to-End Deep Learning Model for Auditory Attention Decoding.
   IEEE Trans Neural Syst Rehabil Eng. 2025;33:2695-2706. doi: 10.1109/TNSRE.2025.3587637.
3. Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning.
   J Neural Eng. 2025 Jun 18;22(3). doi: 10.1088/1741-2552/ade28a.
4. Cortical temporal mismatch compensation in bimodal cochlear implant users: Selective attention decoding and pupillometry study.
   Hear Res. 2025 Aug;464:109306. doi: 10.1016/j.heares.2025.109306. Epub 2025 May 15.
5. A transformer-based network with second-order pooling for motor imagery EEG classification.
   J Neural Eng. 2025 Jul 2. doi: 10.1088/1741-2552/adeae8.
6. Neural Decoding of the Speech Envelope: Effects of Intelligibility and Spectral Degradation.
   Trends Hear. 2024 Jan-Dec;28:23312165241266316. doi: 10.1177/23312165241266316.
7. Exploring the Potential of Electroencephalography Signal-Based Image Generation Using Diffusion Models: Integrative Framework Combining Mixed Methods and Multimodal Analysis.
   JMIR Med Inform. 2025 Jun 25;13:e72027. doi: 10.2196/72027.
8. Short-Term Memory Impairment
9. Contrastive representation learning with transformers for robust auditory EEG decoding.
   Sci Rep. 2025 Aug 6;15(1):28744. doi: 10.1038/s41598-025-13646-4.
10. EEG-based speech imagery decoding by dynamic hypergraph learning within projected and selected feature subspaces.
    J Neural Eng. 2025 Jul 28;22(4). doi: 10.1088/1741-2552/adeec8.

References Cited in This Article

1. Speech-induced suppression during natural dialogues.
   Commun Biol. 2024 Mar 8;7(1):291. doi: 10.1038/s42003-024-05945-9.
2. ERTNet: an interpretable transformer-based framework for EEG emotion recognition.
   Front Neurosci. 2024 Jan 17;18:1320645. doi: 10.3389/fnins.2024.1320645. eCollection 2024.
3. Beyond linear neural envelope tracking: a mutual information approach.
   J Neural Eng. 2023 Mar 9;20(2). doi: 10.1088/1741-2552/acbe1d.
4. Decoding of the speech envelope from EEG using the VLAAI deep neural network.
   Sci Rep. 2023 Jan 16;13(1):812. doi: 10.1038/s41598-022-27332-2.
5. Neural decoding of music from the EEG.
   Sci Rep. 2023 Jan 12;13(1):624. doi: 10.1038/s41598-022-27361-x.
6. Robust decoding of the speech envelope from EEG recordings through deep neural networks.
   J Neural Eng. 2022 Jul 6;19(4). doi: 10.1088/1741-2552/ac7976.
7. Hearing loss is associated with delayed neural responses to continuous speech.
   Eur J Neurosci. 2022 Mar;55(6):1671-1690. doi: 10.1111/ejn.15644. Epub 2022 Mar 18.
8. Emergence of Lie Symmetries in Functional Architectures Learned by CNNs.
   Front Comput Neurosci. 2021 Nov 22;15:694505. doi: 10.3389/fncom.2021.694505. eCollection 2021.
9. Predicting speech intelligibility from EEG in a non-linear classification paradigm.
   J Neural Eng. 2021 Nov 15;18(6). doi: 10.1088/1741-2552/ac33e9.
10. Neural tracking of the fundamental frequency of the voice: The effect of voice characteristics.
    Eur J Neurosci. 2021 Jun;53(11):3640-3653. doi: 10.1111/ejn.15229. Epub 2021 Apr 27.