Fan Cunhang, Zhang Hongmei, Li Andong, Xiang Wang, Zheng Chengshi, Lv Zhao, Wu Xiaopei
Anhui Province Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei 230601, China.
Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, 100190, Beijing, China.
Neural Netw. 2023 Nov;168:508-517. doi: 10.1016/j.neunet.2023.09.041. Epub 2023 Sep 25.
Recent multi-domain processing methods have demonstrated promising performance on monaural speech enhancement tasks. However, few of them explain why they outperform single-domain approaches. As an attempt to fill this gap, this paper presents a complementary single-channel speech enhancement network (CompNet) that demonstrates promising denoising capabilities and provides a unique perspective for understanding the improvements introduced by multi-domain processing. Specifically, the noisy speech is first enhanced by a time-domain network. However, although the waveform can be feasibly recovered, the distribution of the time-frequency bins may still differ in part from the target spectrum when the problem is reconsidered in the frequency domain. To solve this problem, we design a dedicated dual-path network as a post-processing module that independently filters the magnitude and refines the phase. This further drives the estimated spectrum to closely approximate the target spectrum in the time-frequency domain. We conduct extensive experiments on the WSJ0-SI84 and VoiceBank + Demand datasets. Objective test results show that the performance of the proposed system is highly competitive with existing systems.
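The post-processing idea described in the abstract, filtering the magnitude and refining the phase independently, can be illustrated with a minimal sketch. This is not the CompNet implementation: the function names `refine_bins` and `oracle_corrections` are hypothetical, the bin values are made-up toy numbers, and the per-bin gain and phase shift are computed directly from the target (an oracle) as a stand-in for what the two learned paths would have to estimate.

```python
import cmath

def refine_bins(coarse, mag_gain, phase_shift):
    """Apply a real gain to each magnitude and an additive shift to each phase."""
    refined = []
    for x, g, d in zip(coarse, mag_gain, phase_shift):
        mag, phase = cmath.polar(x)          # split the bin into magnitude and phase
        refined.append(cmath.rect(g * mag, phase + d))
    return refined

def oracle_corrections(coarse, target):
    """Hypothetical stand-in for the two learned paths: ideal per-bin corrections."""
    gains, shifts = [], []
    for x, s in zip(coarse, target):
        gains.append(abs(s) / abs(x))                   # magnitude path
        shifts.append(cmath.phase(s) - cmath.phase(x))  # phase path
    return gains, shifts

# Toy time-frequency bins (assumed values): the coarse estimate is close to
# the target but its magnitudes and phases are both slightly off.
target = [1.0 + 0.5j, -0.3 + 0.8j, 0.2 - 0.6j]
coarse = [0.9 + 0.7j, -0.1 + 0.9j, 0.35 - 0.5j]

gains, shifts = oracle_corrections(coarse, target)

# Magnitude filtering alone fixes |X| but leaves the phase error in place.
mag_only = refine_bins(coarse, gains, [0.0] * len(coarse))
mag_only_error = max(abs(r - s) for r, s in zip(mag_only, target))

# Applying both corrections recovers the target bins.
refined = refine_bins(coarse, gains, shifts)
residual_error = max(abs(r - s) for r, s in zip(refined, target))
```

In this toy setting the magnitude-only result still differs from the target, while the combined correction matches it up to floating-point error, which is the sense in which the two paths are complementary.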