Kim Seon Man
Korea Photonics Technology Institute, Gwangju 61007, Korea.
Sensors (Basel). 2020 Oct 10;20(20):5751. doi: 10.3390/s20205751.
This paper proposes a technique to improve a spectral statistical filter for speech enhancement in wearable hearing devices such as hearing aids. The method is implemented on a 32-channel uniform polyphase discrete Fourier transform filter bank, for which the overall algorithm processing delay is 8 ms, in accordance with hearing device requirements. The proposed technique combines non-negative sparse coding (NNSC) with spectral statistical filtering in a unified online framework to overcome the residual noise that spectral statistical filters leave in noisy environments. First, the spectral gain attenuator of the statistical Wiener filter is obtained from the a priori signal-to-noise ratio (SNR), which is estimated using a decision-directed approach. Next, the spectrum estimated with the Wiener spectral gain attenuator is decomposed, by applying NNSC, into target speech and residual noise components. These components are then used to build an NNSC-based Wiener spectral gain attenuator that produces the enhanced speech. The performance of the proposed NNSC-Wiener filter was evaluated using perceptual evaluation of speech quality (PESQ) scores under various noise conditions with SNRs ranging from -5 to 20 dB. The results indicated that the proposed NNSC-Wiener filter outperformed both the conventional Wiener filter and NNSC-based speech enhancement methods at all tested SNRs.
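The first stage described above (a Wiener spectral gain driven by a decision-directed a priori SNR estimate) can be illustrated with a minimal sketch for a single sub-band. This is not the paper's implementation: the function name `wiener_gains_dd`, the smoothing factor `alpha = 0.98`, the SNR floor `xi_min`, and the assumption of a known, stationary noise power are all illustrative choices, and the NNSC decomposition stage is not shown.

```python
def wiener_gains_dd(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Per-frame Wiener gains for one sub-band (illustrative sketch).

    noisy_power: sequence of |Y|^2 values, one per frame.
    noise_power: noise power estimate, assumed known and constant here.
    alpha, xi_min: assumed smoothing factor and a priori SNR floor.
    """
    gains = []
    prev_clean_power = 0.0  # |G * Y|^2 from the previous frame
    for y2 in noisy_power:
        gamma = y2 / noise_power            # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)     # instantaneous SNR estimate
        # Decision-directed estimate: blend the previous frame's
        # enhanced-speech SNR with the current instantaneous estimate.
        xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)                # floor limits musical noise
        gain = xi / (1.0 + xi)              # Wiener gain
        gains.append(gain)
        prev_clean_power = (gain ** 2) * y2
    return gains


if __name__ == "__main__":
    # Noise-only frames followed by frames dominated by speech.
    frames = [1.0, 1.0, 100.0, 100.0, 100.0]
    print([round(g, 3) for g in wiener_gains_dd(frames, noise_power=1.0)])
```

In this sketch the gain stays near the floor while a frame's power matches the noise power and rises toward 1 over a few frames once speech dominates. That lag is the smoothing behaviour of the decision-directed rule.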