Callan D E, Kent R D, Guenther F H, Vorperian H K
ATR Human Information Processing Research Laboratories, Kyoto, Japan.
J Speech Lang Hear Res. 2000 Jun;43(3):721-36. doi: 10.1044/jslhr.4303.721.
The purpose of this article is to demonstrate that self-produced auditory feedback is sufficient to train a mapping between auditory target space and articulator space under conditions in which the structures of speech production are undergoing considerable developmental restructuring. One challenge for competing theories that propose invariant constriction targets is that it is unclear what teaching signal could specify constriction location and degree so that a mapping between constriction target space and articulator space can be learned. It is predicted that a model trained by auditory feedback will accomplish speech goals, in auditory target space, by continuously learning to use different articulator configurations to adapt to the changing acoustic properties of the vocal tract during development. The Maeda articulatory synthesis part of the DIVA neural network model (Guenther et al., 1998) was modified to reflect the development of the vocal tract by using measurements taken from MR images of children. After training, the model was able to maintain the 11 English vowel targets in auditory planning space, utilizing varying articulator configurations, despite morphological changes that occur during development. The vocal-tract constriction pattern (derived from the vocal-tract area function) as well as the formant values varied during the course of development in correspondence with morphological changes in the structures involved with speech production. Despite changes in the acoustical properties of the vocal tract that occur during the course of development, the model was able to demonstrate motor-equivalent speech production under lip-restriction conditions. The model accomplished this in a self-organizing manner even though there was no prior experience with lip restriction during training.
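The learning principle described in the abstract can be illustrated with a small toy sketch. It is not the DIVA model and does not use the Maeda synthesizer. It assumes a hypothetical linear "vocal tract" with three articulators and two auditory dimensions, whose acoustic output shrinks as a made-up `size` parameter grows. The names `vocal_tract`, `babble`, and `reach`, the matrix `W`, and all numeric values are assumptions chosen for illustration. The learner sees only its own auditory output: it babbles to update an estimate of how articulator movements change the sound, then uses the pseudoinverse of that estimate to reach a fixed auditory target at each growth stage and with one articulator clamped.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy plant: 3 articulators -> 2 auditory dimensions.
# The redundancy (3 > 2) is what makes motor equivalence possible.
W = np.array([[1.0, 0.5, -0.3],
              [0.2, -0.8, 0.6]])

def vocal_tract(x, size):
    """Toy plant: the same posture x sounds different as `size` grows."""
    return (W @ x) / size

def babble(J, x, size, steps=200, lr=0.5):
    """Refine the Jacobian estimate J from self-produced sound only,
    using a normalized LMS update on random small movements."""
    for _ in range(steps):
        dx = rng.normal(scale=0.05, size=3)
        dy = vocal_tract(x + dx, size) - vocal_tract(x, size)
        J += lr * np.outer(dy - J @ dx, dx) / (dx @ dx)
    return J

def reach(J, target, x, size, free=(True, True, True), iters=50):
    """Move the free articulators toward an auditory target by applying
    the pseudoinverse of the learned map to the auditory error."""
    free = np.array(free)
    for _ in range(iters):
        err = target - vocal_tract(x, size)
        step = np.zeros(3)
        step[free] = np.linalg.pinv(J[:, free]) @ err
        x = x + step
    return x

target = np.array([0.4, -0.2])   # fixed auditory target (assumed values)
J = np.zeros((2, 3))             # learned map starts with no knowledge
x = np.zeros(3)
errors, postures = {}, {}
for size in (1.0, 1.5, 2.0):     # hypothetical growth stages
    J = babble(J, x, size)       # keep learning as the plant changes
    x = reach(J, target, x, size)
    errors[size] = np.linalg.norm(target - vocal_tract(x, size))
    postures[size] = x.copy()

# Perturbation never seen during babbling: articulator 2 is clamped at 0.
x_start = x.copy()
x_start[2] = 0.0
x_blocked = reach(J, target, x_start, 2.0, free=(True, True, False))
blocked_error = np.linalg.norm(target - vocal_tract(x_blocked, 2.0))

print("auditory error per growth stage:", errors)
print("posture per growth stage:", postures)
print("clamped posture:", x_blocked, "auditory error:", blocked_error)
```

In this sketch the auditory target stays fixed while the posture that achieves it changes with `size`, and the clamped case reaches the same target with a different posture. That mirrors, in a highly simplified form, the varying articulator configurations and the lip-restriction result reported in the abstract.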