Drioli Carlo
Department of Speech, Music and Hearing, Royal Institute of Technology (KTH), Lindstedtsvägen 24, 10044 Stockholm, Sweden.
J Acoust Soc Am. 2005 May;117(5):3184-95. doi: 10.1121/1.1861234.
This study explores whether physically based mathematical models of the voice source can accurately reproduce inverse-filtered glottal volume-velocity waveforms. A low-dimensional, self-oscillating model of the glottal source with waveform-matching properties is proposed. The model relies on a lumped mechano-aerodynamic scheme loosely inspired by one-mass and multimass lumped models. The vocal folds are represented by a single mechanical resonator and a propagation line that accounts for vertical phase differences. The vocal-fold displacement is coupled to the glottal flow by an aerodynamic driving block that includes a general parametric nonlinear component. The principal characteristics of flow-induced oscillation are retained, and the overall model is able to match inverse-filtered glottal flow signals. In principle, the method makes it possible to transform the glottal flow by acting on the physiologically based parameters of the model, a desirable property for applications such as speech synthesis. The model was tested on a data set of inverse-filtered glottal flow waveforms with different characteristics. The results demonstrate that natural speech waveforms can be reproduced with high accuracy and that important characteristics of the synthesis, such as pitch, can be controlled.
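The structure described in the abstract (one mechanical resonator, a propagation line for the vertical phase difference, and an aerodynamic block coupling displacement to flow) can be illustrated with a minimal Python sketch. This is not the paper's model: the driving-pressure formula below is a simple clipped stand-in for the paper's general parametric nonlinear component, the flow uses a plain Bernoulli relation, and every name and parameter value (`DelayLine`, `simulate`, `p_sub`, `m`, `k`, `r`, `x0`, `delay_s`, and so on) is a hypothetical choice made for illustration. Whether the loop sustains oscillation depends on those values.

```python
import math
from collections import deque


class DelayLine:
    """Integer-sample delay standing in for the propagation line."""

    def __init__(self, n_samples):
        if n_samples < 1:
            raise ValueError("delay must be at least one sample")
        self.buf = deque([0.0] * n_samples, maxlen=n_samples)

    def step(self, x):
        y = self.buf[0]      # oldest sample leaves the line
        self.buf.append(x)   # newest sample enters; maxlen drops the oldest
        return y


def simulate(n_steps, fs=44100.0, p_sub=800.0, m=1e-4, k=80.0, r=0.02,
             x0=1e-4, length=0.014, thickness=0.003, delay_s=5e-4, rho=1.14):
    """Return a list of glottal flow samples (SI units, illustrative values)."""
    dt = 1.0 / fs
    delay = DelayLine(max(1, round(delay_s * fs)))
    x, v = 0.0, 0.0          # resonator displacement and velocity
    flow = []
    for _ in range(n_steps):
        x1 = x0 + x                  # lower-edge half-width
        x2 = x0 + delay.step(x)      # upper edge follows with a delay
        a1 = 2.0 * length * x1       # lower glottal area
        a2 = 2.0 * length * x2       # upper glottal area
        a_min = max(0.0, min(a1, a2))
        # Bernoulli-type flow through the narrowest section (assumption).
        flow.append(a_min * math.sqrt(2.0 * p_sub / rho))
        # Stand-in driving pressure: depends on the area ratio, clipped
        # so that the force stays bounded (assumption, not the paper's block).
        if a1 > 0.0:
            ratio = max(0.0, a2) / a1
            p_g = p_sub * max(-1.0, min(1.0, 1.0 - ratio))
        else:
            p_g = 0.0
        force = p_g * length * thickness
        # Semi-implicit Euler step of m*x'' + r*x' + k*x = force.
        v += dt * (force - r * v - k * x) / m
        x += dt * v
    return flow
```

Calling `simulate(2000)` returns 2000 non-negative flow samples. In a waveform-matching setting, parameters such as `k`, `m`, or `delay_s` would be the quantities adjusted to fit a target inverse-filtered waveform, but that fitting step is outside this sketch.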