Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty.

Author information

Lu Yang, Loizou Philipos C

Affiliation

Department of Electrical Engineering, the University of Texas at Dallas, Richardson, TX, 75080, USA.

Publication information

IEEE Trans Audio Speech Lang Process. 2011 Jul 1;19(5):1123-1137. doi: 10.1109/TASL.2010.2082531.

DOI:10.1109/TASL.2010.2082531

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3163489/

Abstract

Statistical estimators of the magnitude-squared spectrum are derived based on the assumption that the magnitude-squared spectrum of the noisy speech signal can be computed as the sum of the (clean) signal and noise magnitude-squared spectra. Maximum a posteriori (MAP) and minimum mean square error (MMSE) estimators are derived based on a Gaussian statistical model. The gain function of the MAP estimator was found to be identical to the gain function used in the ideal binary mask (IdBM) that is widely used in computational auditory scene analysis (CASA). As such, it was binary and assumed the value of 1 if the local SNR exceeded 0 dB, and assumed the value of 0 otherwise. By modeling the local instantaneous SNR as an F-distributed random variable, soft masking methods were derived incorporating SNR uncertainty. In particular, the soft masking method that weighted the noisy magnitude-squared spectrum by the a priori probability that the local SNR exceeds 0 dB was shown to be identical to the Wiener gain function. Results indicated that the proposed estimators yielded significantly better speech quality than the conventional MMSE spectral power estimators, in terms of lower residual noise and lower speech distortion.
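
The two gain functions named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy per-bin values, and the use of oracle (known) clean and noise spectra as the SNR are all assumptions made for illustration. The abstract only states that the MAP gain is binary with a 0 dB threshold and that one soft mask equals the Wiener gain, taken here in its standard form xi / (1 + xi).

```python
import numpy as np

def binary_mask_gain(snr_linear):
    """Hard gain: 1 where the local SNR exceeds 0 dB (linear SNR > 1), else 0."""
    return (np.asarray(snr_linear, dtype=float) > 1.0).astype(float)

def wiener_gain(xi):
    """Standard Wiener gain xi / (1 + xi), with xi the a priori SNR (linear)."""
    xi = np.asarray(xi, dtype=float)
    return xi / (1.0 + xi)

# Toy magnitude-squared spectra for four frequency bins (made-up numbers).
clean_power = np.array([4.0, 0.5, 1.0, 9.0])
noise_power = np.array([1.0, 2.0, 1.0, 1.0])

# Additive assumption from the abstract: noisy = clean + noise.
noisy_power = clean_power + noise_power

# Oracle SNR per bin (an assumption; in practice it must be estimated).
snr = clean_power / noise_power

hard = binary_mask_gain(snr)   # keeps or discards each bin
soft = wiener_gain(snr)        # attenuates each bin smoothly

est_hard = hard * noisy_power  # binary-masked magnitude-squared spectrum
est_soft = soft * noisy_power  # Wiener-weighted magnitude-squared spectrum
```

With oracle SNR values the Wiener-weighted estimate reproduces the clean magnitude-squared spectrum exactly under the additive assumption, since xi / (1 + xi) times (clean + noise) equals clean. With an estimated SNR this no longer holds, which is where the SNR uncertainty discussed in the abstract matters.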

Similar articles

1

Estimators of the Magnitude-Squared Spectrum and Methods for Incorporating SNR Uncertainty.

IEEE Trans Audio Speech Lang Process. 2011 Jul 1;19(5):1123-1137. doi: 10.1109/TASL.2010.2082531.

2

On training targets for deep learning approaches to clean speech magnitude spectrum estimation.

J Acoust Soc Am. 2021 May;149(5):3273. doi: 10.1121/10.0004823.

3

Spectral distortion level resulting in a just-noticeable difference between an a priori signal-to-noise ratio estimate and its instantaneous case.

J Acoust Soc Am. 2020 Oct;148(4):1879. doi: 10.1121/10.0002113.

4

A Laplacian-based MMSE estimator for speech enhancement.

Speech Commun. 2007 Feb;49(2):134-143. doi: 10.1016/j.specom.2006.12.005.

5

Speech enhancement via two-stage dual tree complex wavelet packet transform with a speech presence probability estimator.

J Acoust Soc Am. 2017 Feb;141(2):808. doi: 10.1121/1.4976049.

6

Estimation of a priori signal-to-noise ratio using neurograms for speech enhancement.

J Acoust Soc Am. 2020 Jun;147(6):3830. doi: 10.1121/10.0001324.

7

Role of mask pattern in intelligibility of ideal binary-masked noisy speech.

J Acoust Soc Am. 2009 Sep;126(3):1415-26. doi: 10.1121/1.3179673.

8

Acoustical and Perceptual Analysis of Noise Reduction Strategies in Individuals With Auditory Neuropathy Spectrum Disorders.

J Speech Lang Hear Res. 2020 Dec 14;63(12):4208-4218. doi: 10.1044/2020_JSLHR-20-00176. Epub 2020 Nov 11.

9

Noise reduction for heart sounds using a modified minimum-mean squared error estimator with ECG gating.

Conf Proc IEEE Eng Med Biol Soc. 2006;2006:3385-90. doi: 10.1109/IEMBS.2006.259809.

10

The role of binary mask patterns in automatic speech recognition in background noise.

J Acoust Soc Am. 2013 May;133(5):3083-93. doi: 10.1121/1.4798661.

References cited in this article

1

An algorithm that improves speech intelligibility in noise for normal-hearing listeners.

J Acoust Soc Am. 2009 Sep;126(3):1486-94. doi: 10.1121/1.3184603.

2

A geometric approach to spectral subtraction.

Speech Commun. 2008;50(6):453-466. doi: 10.1016/j.specom.2008.01.003.

3

Factors influencing intelligibility of ideal binary-masked speech: implications for noise reduction.

J Acoust Soc Am. 2008 Mar;123(3):1673-82. doi: 10.1121/1.2832617.

4

Subjective comparison and evaluation of speech enhancement algorithms.

Speech Commun. 2007 Jul;49(7):588-601. doi: 10.1016/j.specom.2006.12.006.

5

Isolating the energetic component of speech-on-speech masking with ideal time-frequency segregation.

J Acoust Soc Am. 2006 Dec;120(6):4007-18. doi: 10.1121/1.2363929.
