Maragos P, Potamianos A
Department of Electrical and Computer Engineering, National Technical University of Athens, Greece.
J Acoust Soc Am. 1999 Mar;105(3):1925-32. doi: 10.1121/1.426738.
Airflow during speech production often becomes turbulent to some degree. In this paper, the geometry of speech turbulence, as reflected in the fragmentation of the time signal, is quantified using fractal models. An efficient algorithm for estimating the short-time fractal dimension of speech signals based on multiscale morphological filtering is described, and its potential for speech segmentation and phonetic classification is discussed. Also reported are experimental results on using the short-time fractal dimension of speech signals at multiple scales as additional features in an automatic speech-recognition system based on hidden Markov models; these features provide a modest improvement in speech-recognition performance.
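The abstract does not give the algorithm's details, so the following is only a rough sketch of one common way to estimate a short-time fractal dimension with multiscale morphological filtering (a "morphological covering" estimate), not necessarily the authors' exact method. It assumes flat structuring elements, a cover area that scales as A(s) ~ s^(2-D), and a least-squares line fit over all scales. The function names, frame length, hop size, and maximum scale are hypothetical choices made for illustration.

```python
import numpy as np


def _dilate3(x):
    # Flat 3-sample dilation (running maximum) with edge replication.
    p = np.pad(x, 1, mode="edge")
    return np.maximum(np.maximum(p[:-2], p[1:-1]), p[2:])


def _erode3(x):
    # Flat 3-sample erosion (running minimum) with edge replication.
    p = np.pad(x, 1, mode="edge")
    return np.minimum(np.minimum(p[:-2], p[1:-1]), p[2:])


def fractal_dimension(frame, max_scale=10):
    """Estimate the fractal dimension of one frame (illustrative sketch)."""
    x = np.asarray(frame, dtype=float)
    upper, lower = x.copy(), x.copy()
    scales = np.arange(1, max_scale + 1)
    areas = np.empty(max_scale)
    for i in range(max_scale):
        # Repeating the 3-sample operators s times equals filtering with
        # a flat structuring element of half-width s.
        upper = _dilate3(upper)
        lower = _erode3(lower)
        areas[i] = np.sum(upper - lower)  # cover area A(s)
    if np.any(areas <= 0):
        # Assumed convention: a constant frame is a straight line, D = 1.
        return 1.0
    # A(s) ~ s^(2 - D)  =>  log(A(s) / s^2) ~ D * log(1 / s) + const
    slope, _ = np.polyfit(np.log(1.0 / scales),
                          np.log(areas / scales**2), 1)
    return float(slope)


def short_time_fractal_dimension(signal, frame_len=480, hop=160,
                                 max_scale=10):
    """Slide a window over the signal and estimate D for each frame."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([fractal_dimension(signal[s:s + frame_len], max_scale)
                     for s in starts])
```

Under these assumptions, a smooth ramp gives an estimate close to 1, while white noise gives a clearly larger value. With few scales the noise estimate stays well below the theoretical limit of 2. The resulting per-frame values could then be appended to a conventional feature vector, in the spirit of the recognition experiments mentioned above.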