Grounding the Meanings in Sensorimotor Behavior using Reinforcement Learning.

Affiliation

Department of Applied Informatics, Comenius University in Bratislava, Bratislava, Slovakia.

Publication information

Front Neurorobot. 2012 Feb 29;6:1. doi: 10.3389/fnbot.2012.00001. eCollection 2012.

DOI: 10.3389/fnbot.2012.00001

PMID: 22393319

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3289932/

Abstract

The recent surge of interest in cognitive developmental robotics is fueled by the ambition to propose ecologically plausible mechanisms of how, among other things, a learning agent/robot could ground linguistic meanings in its sensorimotor behavior. In this vein, we propose a model that allows the simulated iCub robot to learn the meanings of actions (point, touch, and push) oriented toward objects in the robot's peripersonal space. In our experiments, the iCub learns to execute motor actions and comment on them. Architecturally, the model is composed of three neural-network-based modules that are trained in different ways. The first module, a two-layer perceptron, is trained by back-propagation to attend to the target position in the visual scene, given the low-level visual information and the feature-based target information. The second module, which takes the form of an actor-critic architecture, is the most distinguishing part of our model, and is trained by a continuous version of reinforcement learning to execute actions as sequences, based on a linguistic command. The third module, an echo-state network, is trained to provide the linguistic description of the executed actions. The trained model generalizes well to novel action-target combinations with randomized initial arm positions. It can also promptly adapt its behavior if the action/target suddenly changes during motor execution.
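To make the actor-critic idea in the abstract more concrete, the following is a minimal sketch of continuous-action reinforcement learning on a toy 2-D reaching task. It is not the authors' implementation: the abstract does not give the update rule, reward, or network sizes, so everything here is an assumption chosen for illustration. The task (a hand-to-target error vector `e`), the progress-based reward, the linear actor and critic, the step sizes, and the names `train_reach_actor_critic` and `rollout` are all hypothetical. The actor is updated only when the temporal-difference error is positive, which is one common way to handle continuous actions.

```python
import numpy as np

def train_reach_actor_critic(episodes=2000, steps=10, gamma=0.9,
                             alpha=0.05, beta=0.1, sigma=0.1, seed=0):
    """Train a linear actor-critic on a hypothetical 2-D reaching task.

    The state is the error vector e = target - hand (an assumption made
    for this sketch; the paper's robot state is far richer).
    """
    rng = np.random.default_rng(seed)
    actor = np.zeros((2, 2))   # linear policy: mean action = actor @ e
    critic = np.zeros(2)       # linear value: V(e) = critic @ [||e||, 1]

    def features(e):
        return np.array([np.linalg.norm(e), 1.0])

    for _ in range(episodes):
        e = rng.uniform(-1.0, 1.0, size=2)   # randomized initial error
        for _ in range(steps):
            mean = actor @ e
            # Gaussian exploration around the actor's proposed action.
            action = mean + sigma * rng.standard_normal(2)
            e_next = e - action               # moving the hand changes the error
            # Assumed reward: how much closer the hand got to the target.
            reward = np.linalg.norm(e) - np.linalg.norm(e_next)
            # Temporal-difference error from the critic's current estimates.
            td = reward + gamma * critic @ features(e_next) - critic @ features(e)
            critic += alpha * td * features(e)
            # Move the actor toward the explored action only if it turned
            # out better than the critic expected.
            if td > 0:
                actor += beta * np.outer(action - mean, e)
            e = e_next
    return actor, critic

def rollout(actor, e0, steps=10):
    """Run the learned policy without exploration noise; return the final error."""
    e = np.asarray(e0, dtype=float)
    for _ in range(steps):
        e = e - actor @ e
    return e

if __name__ == "__main__":
    actor, _ = train_reach_actor_critic()
    start = np.array([0.6, -0.5])
    end = rollout(actor, start)
    print("initial distance:", np.linalg.norm(start))
    print("final distance:  ", np.linalg.norm(end))
```

In this sketch the critic only serves as a baseline that decides which explored actions count as improvements. A fuller model along the lines of the abstract would also feed the linguistic command and the attended target position into both networks, which is omitted here for brevity.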

Similar articles

Grounding Action Words in the Sensorimotor Interaction with the World: Experiments with a Simulated iCub Humanoid Robot.

Front Neurorobot. 2010 May 31;4. doi: 10.3389/fnbot.2010.00007. eCollection 2010.

iCub-HRI: A Software Framework for Complex Human-Robot Interaction Scenarios on the iCub Humanoid Robot.

Front Robot AI. 2018 Mar 12;5:22. doi: 10.3389/frobt.2018.00022. eCollection 2018.

Learning Actions From Natural Language Instructions Using an ON-World Embodied Cognitive Architecture.

Front Neurorobot. 2021 May 13;15:626380. doi: 10.3389/fnbot.2021.626380. eCollection 2021.

The grounding of higher order concepts in action and language: a cognitive robotics model.

Neural Netw. 2012 Aug;32:165-73. doi: 10.1016/j.neunet.2012.02.012. Epub 2012 Feb 14.

Learning reaching strategies through reinforcement for a sensor-based manipulator.

Neural Netw. 1998 Mar 31;11(2):359-76. doi: 10.1016/s0893-6080(97)00137-8.

Exploring the acquisition and production of grammatical constructions through human-robot interaction with echo state networks.

Front Neurorobot. 2014 May 6;8:16. doi: 10.3389/fnbot.2014.00016. eCollection 2014.

An On-chip Spiking Neural Network for Estimation of the Head Pose of the iCub Robot.

Front Neurosci. 2020 Jun 23;14:551. doi: 10.3389/fnins.2020.00551. eCollection 2020.

Sensorimotor coordination in a "baby" robot: learning about objects through grasping.

Prog Brain Res. 2007;164:403-24. doi: 10.1016/S0079-6123(07)64022-9.

An embodied model for sensorimotor grounding and grounding transfer: experiments with epigenetic robots.

Cogn Sci. 2006 Jul 8;30(4):673-89. doi: 10.1207/s15516709cog0000_72.

Cited by

Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture.

Front Neurorobot. 2013 Apr 8;7:5. doi: 10.3389/fnbot.2013.00005. eCollection 2013.
