


AADNet: An End-to-End Deep Learning Model for Auditory Attention Decoding.

Authors

Nguyen Nhan Duc Thanh, Phan Huy, Geirnaert Simon, Mikkelsen Kaare, Kidmose Preben

Publication

IEEE Trans Neural Syst Rehabil Eng. 2025;33:2695-2706. doi: 10.1109/TNSRE.2025.3587637.

DOI: 10.1109/TNSRE.2025.3587637
PMID: 40633040
Abstract

Auditory attention decoding (AAD) is the process of identifying the attended speech in a multi-talker environment using brain signals, typically recorded through electroencephalography (EEG). Over the past decade, AAD has undergone continuous development, driven by its promising application in neuro-steered hearing devices. Most AAD algorithms rely on the increased neural entrainment to the envelope of attended speech, as compared to unattended speech, typically using a two-step approach. First, the algorithm predicts representations of the attended speech signal envelopes; second, it identifies the attended speech by finding the highest correlation between the predictions and the representations of the actual speech signals. In this study, we propose a novel end-to-end neural network architecture, named AADNet, which combines these two stages into a direct approach to the AAD problem. We compare the proposed network against traditional stimulus decoding-based approaches, including linear stimulus reconstruction, canonical correlation analysis, and an alternative non-linear stimulus reconstruction, on three different datasets. AADNet shows a significant performance improvement for both subject-specific and subject-independent models. Notably, the average subject-independent classification accuracies for different analysis window lengths range from 56.3% (1 s) to 78.1% (20 s), 57.5% (1 s) to 89.4% (40 s), and 56.0% (1 s) to 82.6% (40 s) for the three validated datasets, respectively, showing a significantly improved ability to generalize to data from unseen subjects. These results highlight the potential of deep learning models for advancing AAD, with promising implications for future hearing aids, assistive devices, and clinical assessments.
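The second stage of the classic two-step pipeline the abstract contrasts with AADNet — correlating an EEG-reconstructed envelope against each candidate speech envelope and picking the best match — can be sketched as follows. This is a minimal illustration with synthetic envelopes, not the paper's model, data, or reconstruction method:

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two 1-D signals."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

def decode_attention(reconstructed_env, candidate_envs):
    """Step 2 of the two-step approach: score each speaker's actual
    envelope against the envelope reconstructed from EEG, and return
    the index of the highest-correlating speaker."""
    scores = [pearson_r(reconstructed_env, env) for env in candidate_envs]
    return int(np.argmax(scores)), scores

# Toy stand-in for step 1: the attended envelope plus reconstruction noise.
rng = np.random.default_rng(0)
env_a = rng.standard_normal(1000)                 # speaker A envelope (synthetic)
env_b = rng.standard_normal(1000)                 # speaker B envelope (synthetic)
recon = env_a + 0.5 * rng.standard_normal(1000)   # noisy "reconstruction" of A

winner, scores = decode_attention(recon, [env_a, env_b])
# winner == 0: speaker A is identified as the attended talker.
```

AADNet replaces both the reconstruction step and this correlation-and-argmax step with a single network trained end-to-end on the classification decision itself.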


Similar Articles

1. AADNet: An End-to-End Deep Learning Model for Auditory Attention Decoding.
   IEEE Trans Neural Syst Rehabil Eng. 2025;33:2695-2706. doi: 10.1109/TNSRE.2025.3587637.
2. Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning.
   J Neural Eng. 2025 Jun 18;22(3). doi: 10.1088/1741-2552/ade28a.
3. Unsupervised Accuracy Estimation for Brain-Computer Interfaces Based on Selective Auditory Attention Decoding.
   IEEE Trans Biomed Eng. 2025 Aug;72(8):2388-2399. doi: 10.1109/TBME.2025.3542253.
4. Multi-Class Decoding of Attended Speaker Direction Using Electroencephalogram and Audio Spatial Spectrum.
   IEEE Trans Neural Syst Rehabil Eng. 2025;33:2892-2903. doi: 10.1109/TNSRE.2025.3591819.
5. Short-Term Memory Impairment
6. Cortical temporal mismatch compensation in bimodal cochlear implant users: Selective attention decoding and pupillometry study.
   Hear Res. 2025 Aug;464:109306. doi: 10.1016/j.heares.2025.109306. Epub 2025 May 15.
7. Comparison of Two Modern Survival Prediction Tools, SORG-MLA and METSSS, in Patients With Symptomatic Long-bone Metastases Who Underwent Local Treatment With Surgery Followed by Radiotherapy and With Radiotherapy Alone.
   Clin Orthop Relat Res. 2024 Dec 1;482(12):2193-2208. doi: 10.1097/CORR.0000000000003185. Epub 2024 Jul 23.
8. Schizophrenia detection from electroencephalogram signals using image encoding and wrapper-based deep feature selection approach.
   Sci Rep. 2025 Jul 1;15(1):21390. doi: 10.1038/s41598-025-06121-7.
9. Exploring the Potential of Electroencephalography Signal-Based Image Generation Using Diffusion Models: Integrative Framework Combining Mixed Methods and Multimodal Analysis.
   JMIR Med Inform. 2025 Jun 25;13:e72027. doi: 10.2196/72027.
10. Data Collection for Automatic Depression Identification in Spanish Speakers Using Deep Learning Algorithms: Protocol for a Case-Control Study.
   JMIR Res Protoc. 2025 Jul 31;14:e60439. doi: 10.2196/60439.