Department of Signal Processing and Acoustics, Aalto University, P.O. Box 13000, FI-00076 Aalto, Finland.
J Acoust Soc Am. 2013 Aug;134(2):1295-313. doi: 10.1121/1.4812756.
All-pole modeling is a widely used formant estimation method, but its performance is known to deteriorate for high-pitched voices. To address this problem, several all-pole modeling methods robust to fundamental frequency have been proposed. This study compares five such previously known methods and introduces a technique, Weighted Linear Prediction with Attenuated Main Excitation (WLP-AME). WLP-AME uses temporally weighted linear prediction (LP), in which the square of the prediction error is multiplied by a given parametric weighting function. The weighting reduces the contribution of the main excitation of the vocal tract when the filter coefficients are optimized. Consequently, the resulting all-pole model is affected more by the characteristics of the vocal tract, leading to less biased formant estimates. Experiments with synthetic vowels created using a physical modeling approach showed that WLP-AME yields more accurate formant frequencies for high-pitched sounds than the previously known methods (e.g., the relative error in the first formant of the vowel [a] decreased from 11% to 3% when conventional LP was replaced with WLP-AME). Experiments conducted on natural vowels indicate that the formants detected by WLP-AME change more regularly between repetitions of different pitch than those computed by conventional LP.
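The weighting idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the rectangular weighting function, the covariance-style least-squares formulation, and every function name here (`ame_weights`, `weighted_lp`, `formants_hz`, `synth_resonance`) are assumptions made for illustration, and the excitation instants are taken as known, whereas for natural speech they would have to be estimated. The sketch minimizes the weighted squared prediction error, sum over n of w[n] * (x[n] - sum over k of a[k] * x[n-k])^2, with w[n] lowered around each main excitation.

```python
import numpy as np

def ame_weights(n_samples, excitation_idx, width, floor=0.01):
    """Hypothetical rectangular weighting: 1 everywhere, `floor` for
    `width` samples starting at each main-excitation instant."""
    w = np.ones(n_samples)
    for i in excitation_idx:
        w[max(i, 0):min(i + width, n_samples)] = floor
    return w

def weighted_lp(x, order, w):
    """Predictor coefficients a[0..order-1] minimizing
    sum_n w[n] * (x[n] - sum_k a[k-1] * x[n-k])**2 (covariance-style rows)."""
    n = np.arange(order, len(x))
    past = np.column_stack([x[n - k] for k in range(1, order + 1)])
    sw = np.sqrt(w[n])  # scaling rows by sqrt(w) weights the squared error by w
    a, *_ = np.linalg.lstsq(past * sw[:, None], x[n] * sw, rcond=None)
    return a

def formants_hz(a, fs):
    """Frequencies (Hz) of the complex poles of 1 / (1 - sum_k a[k-1] z^-k)."""
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[roots.imag > 0]
    return np.sort(np.angle(roots) * fs / (2 * np.pi))

def synth_resonance(f_hz, bw_hz, f0_hz, fs, n_samples):
    """Toy test signal (an assumption, not the paper's physical model):
    a single two-pole resonance excited by an impulse train at f0_hz."""
    r = np.exp(-np.pi * bw_hz / fs)
    a1, a2 = 2 * r * np.cos(2 * np.pi * f_hz / fs), -r * r
    pulses = np.arange(0, n_samples, int(round(fs / f0_hz)))
    e = np.zeros(n_samples)
    e[pulses] = 1.0
    x = np.zeros(n_samples)
    for i in range(n_samples):
        x[i] = e[i]
        if i >= 1:
            x[i] += a1 * x[i - 1]
        if i >= 2:
            x[i] += a2 * x[i - 2]
    return x, pulses

fs = 8000
x, pulses = synth_resonance(f_hz=700, bw_hz=80, f0_hz=400, fs=fs, n_samples=400)

# Attenuate the samples at and just after each excitation instant.
w = ame_weights(len(x), pulses, width=3, floor=0.0)
f_wlp = formants_hz(weighted_lp(x, 2, w), fs)

# Conventional (unweighted) LP for comparison.
f_lp = formants_hz(weighted_lp(x, 2, np.ones(len(x))), fs)

print("true formant: 700 Hz")
print("weighted LP estimate:", f_wlp)
print("conventional LP estimate:", f_lp)
```

In this toy case the excitation is an ideal impulse train, so giving the excitation samples zero weight leaves only samples that the resonator predicts exactly, and the weighted estimate recovers the resonance frequency. Real vowels have a spread-out glottal excitation, which is why a nonzero floor and a wider attenuation region would be the more realistic setting.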