Improving auditory attention decoding in noisy environments for listeners with hearing impairment through contrastive learning.

Author information

Sridhar Gautam, Boselli Sofía, Skoglund Martin A, Bernhardsson Bo, Alickovic Emina

Author affiliations

Department of Automatic Control, Lund University, Lund, Sweden.

Eriksholm Research Centre, Snekkersten, Denmark.

Publication information

J Neural Eng. 2025 Jun 18;22(3). doi: 10.1088/1741-2552/ade28a.

Abstract

Objective. This study aimed to investigate the potential of contrastive learning to improve auditory attention decoding (AAD) using electroencephalography (EEG) data in challenging cocktail-party scenarios with competing speech and background noise. Approach. Three models were implemented for comparison: a baseline linear model (LM), a nonlinear model without contrastive learning (NLM), and a nonlinear model with contrastive learning (NLMwCL). The EEG data and speech envelopes were used to train these models. The NLMwCL model used SigLIP, a variant of the CLIP loss, to embed the data. Speech envelopes were reconstructed from the models and compared with the attended and ignored speech envelopes to assess reconstruction accuracy, measured as the correlation between the reconstructed and actual speech envelopes. These reconstruction accuracies were then compared to classify attention. All models were evaluated on 34 listeners with hearing impairment. Main results. Reconstruction accuracy for attended and ignored speech, along with attention classification accuracy, was calculated for each model across various time windows. The NLMwCL consistently outperformed the other models in both speech reconstruction and attention classification. For a 3-second time window, the NLMwCL model achieved a mean attended-speech reconstruction accuracy of 0.105 and a mean attention classification accuracy of 68.0%, while the NLM model scored 0.096 and 64.4%, and the LM achieved 0.084 and 62.6%, respectively. Significance. These findings demonstrate the promise of contrastive learning for improving AAD and highlight the potential of EEG-based tools for clinical applications and for advancing hearing technology, particularly the design of new neuro-steered signal processing algorithms.
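
To make the decoding pipeline described in the abstract concrete, the sketch below illustrates, under stated assumptions, a SigLIP-style pairwise sigmoid loss between EEG and speech-envelope embeddings and the correlation-based comparison used to classify attention. It is a minimal PyTorch illustration: the encoders producing eeg_emb and env_emb, the tensor shapes, and the hyperparameters are assumptions for the sake of the example, not the paper's actual architecture.

import torch
import torch.nn.functional as F

def siglip_loss(eeg_emb, env_emb, scale, bias):
    # SigLIP-style pairwise sigmoid loss over a batch of matched (EEG, envelope) windows.
    # eeg_emb, env_emb: (batch, dim) L2-normalised embeddings from two hypothetical encoders;
    # scale and bias are learnable scalars, as in SigLIP.
    logits = scale * eeg_emb @ env_emb.t() + bias
    labels = 2.0 * torch.eye(logits.size(0), device=logits.device) - 1.0  # +1 for matched pairs, -1 otherwise
    return -F.logsigmoid(labels * logits).mean()

def classify_attention(reconstructed, attended, ignored):
    # Attention is assigned to the speech stream whose envelope correlates more strongly
    # with the envelope reconstructed from EEG.
    def pearson(a, b):
        a, b = a - a.mean(), b - b.mean()
        return (a * b).sum() / (a.norm() * b.norm() + 1e-8)
    return pearson(reconstructed, attended) > pearson(reconstructed, ignored)

In this reading, a window counts as correctly decoded when the reconstructed envelope correlates more with the attended than with the ignored envelope; the reported classification accuracies (e.g. 68.0% for the NLMwCL model at 3-second windows) come from this kind of per-window comparison.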
