Murphy Peter J, Akande Olatunji O
Department of Electronic and Computer Engineering, University of Limerick, Limerick, Ireland.
J Acoust Soc Am. 2007 Mar;121(3):1679-90. doi: 10.1121/1.2427123.
Cepstral-based estimation is used to provide a baseline estimate of the noise level in the logarithmic spectrum of voiced speech. A theoretical description of cepstral processing of voiced speech containing aspiration noise, together with supporting empirical data, is provided to illustrate the nature of the noise baseline estimation process. Taking the Fourier transform of the liftered (filtered in the cepstral domain) cepstrum produces a noise baseline estimate. It is shown that Fourier transforming the low-pass liftered cepstrum is comparable to applying a moving average (MA) filter to the logarithmic spectrum, and hence the baseline receives contributions from both the glottal-source-excited vocal tract and the noise-excited vocal tract. Because the estimation process resembles the action of an MA filter, the resulting noise baseline is determined by the harmonic resolution (as determined by the temporal analysis window length) and the glottal source spectral tilt. When an appropriate temporal analysis window length is selected, the estimated baseline is shown to lie halfway between the glottal-excited vocal tract and the noise-excited vocal tract. This information is employed in a new harmonics-to-noise ratio (HNR) estimation technique, which is shown to provide accurate HNR estimates when tested on synthetically generated voice signals.
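The baseline estimation step described in the abstract (log spectrum, cepstrum, low-pass lifter, Fourier transform back) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cepstral_baseline` and `synthetic_voiced_frame`, the parameter `n_lifter`, the Hann window, and all numeric values (sampling rate, fundamental frequency, noise level) are assumptions chosen only for illustration.

```python
import numpy as np


def cepstral_baseline(frame, n_lifter):
    """Estimate a smooth baseline of the log-magnitude spectrum of one frame.

    frame    : 1-D array of real samples
    n_lifter : number of low-quefrency bins kept (hypothetical parameter)
    Returns (log_mag_db, baseline_db), each of length len(frame).
    """
    n = len(frame)
    windowed = frame * np.hanning(n)  # Hann window is an assumption
    log_mag_db = 20.0 * np.log10(np.abs(np.fft.fft(windowed)) + 1e-12)

    # Real cepstrum: inverse DFT of the log-magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag_db).real

    # Symmetric low-pass lifter: keep bins 0..n_lifter-1 and their mirror images.
    lifter = np.zeros(n)
    lifter[:n_lifter] = 1.0
    if n_lifter > 1:
        lifter[n - n_lifter + 1:] = 1.0

    # Fourier transform of the liftered cepstrum gives the baseline (in dB).
    baseline_db = np.fft.fft(cepstrum * lifter).real
    return log_mag_db, baseline_db


def synthetic_voiced_frame(n=2048, fs=16000, f0=100.0, noise_std=0.05, seed=0):
    """Crude stand-in for voiced speech: an impulse train plus white noise."""
    rng = np.random.default_rng(seed)
    period = int(round(fs / f0))  # pitch period in samples
    pulses = np.zeros(n)
    pulses[::period] = 1.0
    return pulses + noise_std * rng.standard_normal(n)


frame = synthetic_voiced_frame()
# The lifter cutoff (40 bins) is kept below the pitch period (160 samples)
# so that the harmonic ripple is removed from the baseline.
log_mag_db, baseline_db = cepstral_baseline(frame, n_lifter=40)
```

Multiplying the cepstrum by a lifter corresponds to circularly convolving the log spectrum with the Fourier transform of that lifter, scaled by 1/N. For the rectangular lifter used here, that kernel is a periodic sinc rather than an exact rectangular window, which is consistent with the abstract's wording that the operation is "comparable to" an MA filter.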