Laurent Raphaël, Barnaud Marie-Lou, Schwartz Jean-Luc, Bessière Pierre, Diard Julien
GIPSA-lab, Université Grenoble Alpes.
Institut des Systèmes Intelligents et de Robotique, Sorbonne Universités, Université Pierre et Marie Curie.
Psychol Rev. 2017 Oct;124(5):572-602. doi: 10.1037/rev0000069. Epub 2017 May 4.
There is a consensus that both auditory and motor representations intervene in the perceptual processing of speech units. However, the functional role of each of these systems is seldom addressed and remains poorly understood. We capitalized on the formal framework of Bayesian Programming to develop COSMO (Communicating about Objects using Sensory-Motor Operations), an integrative model that allows principled comparisons of purely motor or purely auditory implementations of a speech perception task and tests the gain of efficiency provided by their Bayesian fusion. Here, we show 3 main results: (a) In a set of precisely defined "perfect conditions," auditory and motor theories of speech perception are indistinguishable; (b) When a learning process that mimics speech development is introduced into COSMO, the model departs from these perfect conditions. Auditory recognition then becomes more efficient than motor recognition in dealing with learned stimuli, while motor recognition is more efficient in adverse conditions. We interpret this result as a general "auditory-narrowband versus motor-wideband" property; and (c) Simulations of plosive-vowel syllable recognition reveal possible cues from motor recognition for the invariant specification of the place of plosive articulation in context, cues that are lacking in the auditory pathway. This provides COSMO with a second property, whereby auditory cues would be more efficient for vowel decoding and motor cues for plosive articulation decoding. These simulations provide several predictions, which are in good agreement with experimental data and suggest that there is a natural complementarity between auditory and motor processing within a perceptuo-motor theory of speech perception.
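Result (a) and the notion of Bayesian fusion can be illustrated with a toy discrete model. The sketch below is not the authors' implementation: the variable sizes, all probability values, and the function names are hypothetical, and the product rule used for fusion is an assumption (one common form of Bayesian fusion). It only shows that when the auditory model P(S | O) is exactly the motor pathway marginalized over gestures M, the auditory and motor posteriors over objects O coincide.

```python
# Toy sketch: auditory vs. motor recognition and their fusion.
# All numbers below are illustrative assumptions, not values from the paper.

def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

N_OBJECTS, N_GESTURES, N_SOUNDS = 2, 3, 3

P_O = [0.5, 0.5]                      # prior over objects O
P_M_GIVEN_O = [[0.7, 0.2, 0.1],       # motor repertoire P(M | O)
               [0.1, 0.2, 0.7]]
P_S_GIVEN_M = [[0.80, 0.15, 0.05],    # articulatory-to-acoustic model P(S | M)
               [0.10, 0.80, 0.10],
               [0.05, 0.15, 0.80]]

def motor_likelihood(s):
    """P(S=s | O) for each O, obtained by summing over motor gestures M."""
    return [sum(P_M_GIVEN_O[o][m] * P_S_GIVEN_M[m][s] for m in range(N_GESTURES))
            for o in range(N_OBJECTS)]

# "Perfect conditions" (assumed here): the auditory model is exactly the
# motor pathway marginalized over M.
P_S_GIVEN_O = [[motor_likelihood(s)[o] for s in range(N_SOUNDS)]
               for o in range(N_OBJECTS)]

def auditory_posterior(s, p_s_given_o=P_S_GIVEN_O):
    """P(O | S=s) using the direct auditory model P(S | O)."""
    return normalize([P_O[o] * p_s_given_o[o][s] for o in range(N_OBJECTS)])

def motor_posterior(s):
    """P(O | S=s) using the motor pathway (sum over gestures)."""
    lik = motor_likelihood(s)
    return normalize([P_O[o] * lik[o] for o in range(N_OBJECTS)])

def fused_posterior(s, p_s_given_o=P_S_GIVEN_O):
    """Assumed fusion rule: normalized product of the two posteriors."""
    aud = auditory_posterior(s, p_s_given_o)
    mot = motor_posterior(s)
    return normalize([aud[o] * mot[o] for o in range(N_OBJECTS)])

if __name__ == "__main__":
    for s in range(N_SOUNDS):
        print(s, auditory_posterior(s), motor_posterior(s), fused_posterior(s))
```

Passing a different `p_s_given_o` table to `auditory_posterior` (for example, one estimated from a limited set of learned stimuli) makes the two pathways diverge, which is the kind of departure from perfect conditions that result (b) describes.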