Low-latency monaural speech enhancement with deep filter-bank equalizer.

Affiliation

Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, 100190 Beijing, China.

Publication information

J Acoust Soc Am. 2022 May;151(5):3291. doi: 10.1121/10.0011396.

DOI:10.1121/10.0011396

PMID:35649938

Abstract

It is highly desirable that speech enhancement algorithms can achieve good performance while keeping low latency for many applications, such as digital hearing aids, mobile phones, acoustically transparent hearing devices, and public address systems. To improve the performance of traditional low-latency speech enhancement algorithms, a deep filter-bank equalizer (FBE) framework was proposed that integrated a deep learning-based subband noise reduction network with a deep learning-based shortened digital filter mapping network. In the first network, a deep learning model was trained with a controllable small frame shift to satisfy the low-latency demand, i.e., no greater than 4 ms, so as to obtain (complex) subband gains that could be regarded as an adaptive digital filter in each frame. In the second network, to reduce the latency, this adaptive digital filter was implicitly shortened by a deep learning-based framework and was then applied to noisy speech to reconstruct the enhanced speech without the overlap-add method. Experimental results on the WSJ0-SI84 corpus indicated that the proposed DeepFBE with only 4-ms latency achieved much better performance than traditional low-latency speech enhancement algorithms across several objective metrics. Listening test results further confirmed that our approach achieved higher speech quality than other methods.
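The idea of treating per-frame subband gains as a short time-domain filter can be illustrated with a minimal sketch of a classical (non-deep) filter-bank equalizer. This is not the authors' implementation: in the paper the gains come from a noise reduction network and the shortening is learned by a second network, whereas here a fixed windowed truncation stands in for the learned shortening. The function names `gains_to_short_fir` and `apply_time_varying_fir`, the real-valued gains, the Hann taper, and the 16 kHz sample rate mentioned in the comments are all assumptions made for illustration.

```python
import numpy as np

def gains_to_short_fir(gains, num_taps):
    """Convert real subband gains (rfft bins) to a shortened linear-phase FIR.

    Assumption: gains are real and num_taps is odd. The filter delay is
    num_taps // 2 samples instead of n_fft // 2 for the untruncated filter.
    """
    n_fft = 2 * (len(gains) - 1)
    zero_phase = np.fft.irfft(gains, n=n_fft)   # symmetric around index 0
    centered = np.fft.fftshift(zero_phase)      # peak moves to n_fft // 2
    half = num_taps // 2
    mid = n_fft // 2
    taps = centered[mid - half: mid + half + 1] # keep the central taps only
    return taps * np.hanning(num_taps)          # taper the truncation edges

def apply_time_varying_fir(noisy, gains_per_frame, hop, num_taps):
    """Filter the signal in the time domain, one hop at a time.

    Each frame's filter is applied by direct convolution to its own hop of
    samples, so no overlap-add synthesis is involved.
    """
    out = np.zeros(len(noisy))
    # Prepend zeros so the first output samples see a full filter history.
    padded = np.concatenate([np.zeros(num_taps - 1), noisy])
    for i, gains in enumerate(gains_per_frame):
        start = i * hop
        stop = min(start + hop, len(noisy))
        if start >= stop:
            break
        taps = gains_to_short_fir(gains, num_taps)
        segment = padded[start: stop + num_taps - 1]
        out[start:stop] = np.convolve(segment, taps, mode="valid")
    return out

# Usage: with unity gains the shortened filter is a pure delay of
# num_taps // 2 = 16 samples (1 ms at an assumed 16 kHz sample rate).
rng = np.random.default_rng(0)
noisy = rng.standard_normal(160)
unit_gains = np.ones((10, 65))                  # 10 frames, n_fft = 128
delayed = apply_time_varying_fir(noisy, unit_gains, hop=16, num_taps=33)
```

In this sketch the algorithmic delay is set by the filter length rather than by the analysis frame length, which is the property a filter-bank equalizer relies on to keep latency low. Aggressive truncation blurs the frequency response of the gains, and the abstract indicates that the second network is meant to handle this shortening implicitly.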

