RBCS - Robotics, Brain and Cognitive Sciences Department, IIT - Istituto Italiano di Tecnologia.
Top Cogn Sci. 2014 Jul;6(3):461-75. doi: 10.1111/tops.12095. Epub 2014 Jun 17.
Action perception and recognition are core abilities underlying human social interaction. A parieto-frontal network (the mirror neuron system) maps visually presented biological motion information onto the observer's motor representations. This matching of others' actions onto our own sensorimotor repertoire is thought to be important for action recognition, providing a non-mediated "motor perception" based on a bidirectional flow of information along the mirror parieto-frontal circuits. State-of-the-art machine learning strategies for hand action identification perform better when sensorimotor data, rather than visual information alone, are available during learning. Because speech is a particular type of action (one with acoustic targets), it is expected to activate a mirror neuron mechanism. Indeed, motor centers have been shown to be causally involved in the discrimination of speech sounds during speech perception. In this paper, we review recent neurophysiological and machine learning-based studies showing (a) the specific contribution of the motor system to speech perception and (b) that automatic phone recognition is significantly improved when motor data are used during classifier training (as opposed to learning from purely auditory data).