Story Brad H, Bunton Kate
Speech Acoustics Laboratory, Department of Speech, Language, and Hearing Sciences, University of Arizona, P.O. Box 210071, Tucson, AZ 85721.
Speech Commun. 2017 Mar;87:1-17. doi: 10.1016/j.specom.2016.12.001. Epub 2016 Dec 9.
The purpose of this study was to further develop a multi-tier model of the vocal tract area function in which the modulations of shape to produce speech are generated by the product of a vowel substrate and a consonant superposition function. The new approach consists of specifying input parameters for a target consonant as a set of directional changes in the resonance frequencies of the vowel substrate. Using calculations of acoustic sensitivity functions, these "resonance deflection patterns" are transformed into time-varying deformations of the vocal tract shape without any direct specification of the location or extent of the consonant constriction along the vocal tract. The configurations of constrictions and expansions generated by this process were shown to be physiologically realistic and to produce speech sounds that are easily identifiable as the target consonants. This model is a useful enhancement for area-function-based synthesis and can serve as a tool for understanding how the vocal tract is shaped by a talker during speech production.
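The mapping from a resonance deflection pattern to a deformed area function can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a uniform tube (closed at the glottis, open at the lips) as the vowel substrate, for which the sensitivity of the n-th resonance has the standard analytic form -cos((2n-1)πx/L), and it applies the deflection in a single step with a fixed magnitude, whereas the abstract describes a time-varying deformation. All names, the tube length, the section count, and the area value are hypothetical.

```python
import math


def sensitivity(n, x, length):
    """Sensitivity of resonance n at position x for a uniform tube.

    Assumption: tube closed at x = 0 (glottis) and open at x = length (lips).
    Positive values mean that enlarging the area at x raises the resonance.
    """
    return -math.cos((2 * n - 1) * math.pi * x / length)


def consonant_superposition(deflection, xs, length, magnitude):
    """Build a multiplicative deformation from a deflection pattern.

    deflection: one entry per resonance, +1 (raise) or -1 (lower).
    magnitude:  hypothetical deformation strength in [0, 1).
    """
    raw = [
        sum(d * sensitivity(n, x, length) for n, d in enumerate(deflection, start=1))
        for x in xs
    ]
    peak = max(abs(v) for v in raw)
    # Values below 1 constrict the substrate, values above 1 expand it.
    return [1.0 + magnitude * v / peak for v in raw]


def area_function(vowel, consonant):
    """Area function as the product of vowel substrate and consonant function."""
    return [v * c for v, c in zip(vowel, consonant)]


LENGTH_CM = 17.5   # hypothetical tract length
N_SECTIONS = 44    # hypothetical number of tube sections
xs = [(i + 0.5) * LENGTH_CM / N_SECTIONS for i in range(N_SECTIONS)]
vowel = [3.0] * N_SECTIONS  # uniform substrate, area in cm^2 (hypothetical)

# Lower the first three resonances; no constriction location is specified.
deflection = [-1, -1, -1]
consonant = consonant_superposition(deflection, xs, LENGTH_CM, magnitude=0.9)
area = area_function(vowel, consonant)
narrowest = min(range(N_SECTIONS), key=area.__getitem__)
print("narrowest section index:", narrowest)
```

In this toy case, asking for all three resonances to fall places the narrowest section at the lip end of the tube, so a constriction location emerges from the deflection pattern alone. A realistic vowel substrate would need sensitivity functions computed numerically from an acoustic model of the tract.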