Guenther F H
Boston University, Center for Adaptive Systems, MA 02215.
Biol Cybern. 1994;72(1):43-53. doi: 10.1007/BF00206237.
This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.
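The two ideas in the abstract, convex-region targets and a direction-to-articulator mapping that exploits redundancy, can be illustrated with a small sketch. This is not the paper's neural network. It assumes a hypothetical linear vocal tract with three articulators (jaw, lip, tongue) and two orosensory dimensions (lip closure, tongue height), and it substitutes a fixed Jacobian-transpose rule for the mapping the model learns during babbling. The names `JACOBIAN`, `REGION`, `reach`, and all numeric values are illustrative assumptions.

```python
# Toy illustration only: a hypothetical linear vocal tract, not the model
# described in the abstract.
# Rows: orosensory dimensions (lip closure, tongue height).
# Columns: articulators (jaw, lip, tongue). The jaw contributes to both rows,
# which makes the system redundant (3 articulators, 2 orosensory dimensions).
JACOBIAN = [[1.0, 1.0, 0.0],
            [1.0, 0.0, 1.0]]

# Assumed convex-region target: one (low, high) interval per orosensory dimension.
REGION = [(0.8, 1.0), (0.4, 0.6)]


def orosensory(artic):
    """Forward map from articulator positions to orosensory coordinates."""
    return [sum(j * a for j, a in zip(row, artic)) for row in JACOBIAN]


def nearest_in_region(x, region):
    """Closest point to x inside the box-shaped target region."""
    return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, region)]


def distance_to_region(x, region):
    """Euclidean distance from x to the target region (0 if inside)."""
    nearest = nearest_in_region(x, region)
    return sum((n - xi) ** 2 for n, xi in zip(nearest, x)) ** 0.5


def reach(artic, region, blocked=(), rate=0.2, steps=300):
    """Move articulators until the orosensory state approaches the region.

    Each step maps the desired orosensory direction onto articulator
    movements with the Jacobian transpose (a stand-in for the learned
    orosensory-to-articulatory mapping). Articulators listed in `blocked`
    are held fixed; the same rule is reused, with no relearning.
    """
    artic = list(artic)
    for _ in range(steps):
        x = orosensory(artic)
        target = nearest_in_region(x, region)
        direction = [t - xi for t, xi in zip(target, x)]
        for k in range(len(artic)):
            if k in blocked:
                continue
            artic[k] += rate * sum(
                JACOBIAN[d][k] * direction[d] for d in range(len(direction))
            )
    return artic


free = reach([0.0, 0.0, 0.0], REGION)                  # all articulators free
fixed_jaw = reach([0.0, 0.0, 0.0], REGION, blocked={0})  # jaw held at 0
```

In this sketch, `free` and `fixed_jaw` are different articulator configurations that both bring the orosensory state to the same target region, which is the sense of motor equivalence used in the abstract. Because the target is a region rather than a point, the final tongue height can also differ between the two runs, a simple analogue of contextual variability.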