Neural Cascade Architecture for Joint Acoustic Echo and Noise Suppression

Authors

Zhang Hao, Wang DeLiang

Affiliations

Department of Computer Science and Engineering, The Ohio State University, USA.

Center for Cognitive and Brain Sciences, The Ohio State University, USA.

Publication information

Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:671-675. doi: 10.1109/icassp43922.2022.9747445. Epub 2022 Apr 27.

Abstract

In this paper, we propose a neural cascade architecture for joint acoustic echo and noise suppression. The proposed cascade architecture consists of two modules. A convolutional recurrent network (CRN) is employed in the first module for complex spectral mapping. The output is then fed as an additional input to the second module, where a long short-term memory network (LSTM) is utilized for magnitude mask estimation. The entire architecture is trained in an end-to-end manner with the two modules optimized jointly using a single loss function. The final output is generated using the enhanced phase and magnitude obtained from the first and the second module, respectively. The cascade architecture enables the proposed method to obtain robust magnitude estimation as well as phase enhancement. Evaluation results show that the proposed method effectively suppresses acoustic echo and noise while preserving good speech quality, and significantly outperforms related methods.
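The final recombination step described in the abstract (phase from the first module, magnitude from the second) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `combine_phase_and_magnitude`, the array shapes, and the random stand-ins for the CRN and LSTM outputs are all hypothetical. In particular, the abstract does not say which magnitude the estimated mask is applied to; the sketch assumes it is the microphone-signal magnitude.

```python
import numpy as np


def combine_phase_and_magnitude(stage1_spec, stage2_mask, mic_mag):
    """Build a complex spectrogram from two module outputs.

    stage1_spec: complex array (frames, bins), stand-in for the first
                 module's complex spectral mapping output.
    stage2_mask: real array (frames, bins), stand-in for the second
                 module's magnitude mask.
    mic_mag:     real array (frames, bins), magnitude that the mask scales
                 (ASSUMPTION: the microphone-signal magnitude).
    """
    # Enhanced phase comes from the first module's complex estimate.
    phase = np.angle(stage1_spec)
    # Enhanced magnitude comes from the second module's mask.
    magnitude = stage2_mask * mic_mag
    return magnitude * np.exp(1j * phase)


# Toy data only; these arrays are placeholders, not trained network outputs.
rng = np.random.default_rng(0)
frames, freq_bins = 4, 5
mic_spec = rng.standard_normal((frames, freq_bins)) + 1j * rng.standard_normal(
    (frames, freq_bins)
)
stage1_spec = 0.5 * mic_spec * np.exp(1j * 0.3)  # hypothetical module-1 output
stage2_mask = rng.uniform(0.1, 1.0, size=(frames, freq_bins))  # hypothetical mask

output = combine_phase_and_magnitude(stage1_spec, stage2_mask, np.abs(mic_spec))
```

In this sketch the output magnitude is determined entirely by the mask branch and the output phase entirely by the complex-mapping branch, which mirrors the division of labor the abstract describes.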

Similar articles

1
Neural Cascade Architecture for Joint Acoustic Echo and Noise Suppression.
Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:671-675. doi: 10.1109/icassp43922.2022.9747445. Epub 2022 Apr 27.
2
Neural Cascade Architecture with Triple-domain Loss for Speech Enhancement.
IEEE/ACM Trans Audio Speech Lang Process. 2022;30:734-743. doi: 10.1109/taslp.2021.3138716. Epub 2021 Dec 28.
