Which is the best intrinsic motivation signal for learning multiple skills?

Affiliations

Laboratory of Computational Embodied Neuroscience, Istituto di Scienze e Tecnologie della Cognizione, Consiglio Nazionale delle Ricerche, Roma, Italy; School of Computing and Mathematics, University of Plymouth, Plymouth, UK.

Publication information

Front Neurorobot. 2013 Nov 12;7:22. doi: 10.3389/fnbot.2013.00022. eCollection 2013.

Abstract

Humans and other biological agents are able to autonomously learn and cache different skills in the absence of any biological pressure or any assigned task. In this respect, Intrinsic Motivations (IMs; i.e., motivations not connected to reward-related stimuli) play a cardinal role in animal learning, and can be considered a fundamental tool for developing more autonomous and more adaptive artificial agents. In this work, we provide an exhaustive analysis of a scarcely investigated problem: which kind of IM reinforcement signal is the most suitable for driving the acquisition of multiple skills in the shortest time? To this end, we implemented an artificial agent with a hierarchical architecture that allows it to learn and cache different skills. We tested the system in a setup with continuous states and actions, in particular with a kinematic robotic arm that has to learn different reaching tasks. We compared the results of different versions of the system driven by several different intrinsic motivation signals. The results show (a) that intrinsic reinforcements purely based on the knowledge of the system are not appropriate to guide the acquisition of multiple skills, and (b) that the stronger the link between the IM signal and the competence of the system, the better the performance.
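The distinction the abstract draws between knowledge-based and competence-based signals can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function and class names, the absolute-error form of both signals, the learning rate, and the softmax selector are all assumptions made for illustration. The knowledge-based signal here depends only on how well a model predicts an outcome, whereas the competence-based signal depends on how well the agent can currently predict its own success on a given skill.

```python
import math


def knowledge_based_signal(predicted, observed):
    """Error of a model predicting an outcome of an action (hypothetical form)."""
    return abs(observed - predicted)


class CompetencePredictor:
    """Running estimate of how often one skill reaches its goal (hypothetical)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.estimate = 0.0  # predicted probability of reaching the goal

    def signal(self, success):
        """Return the competence-based signal for one trial, then update."""
        error = float(success) - self.estimate
        self.estimate += self.learning_rate * error
        return abs(error)


def softmax(values, temperature=0.1):
    """Turn per-skill signals into probabilities of selecting each skill."""
    peak = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


# A skill practised successfully many times versus one tried for the first time.
mastered = CompetencePredictor()
for _ in range(10):
    mastered.signal(True)
novel = CompetencePredictor()

signals = [mastered.signal(True), novel.signal(True)]
probabilities = softmax(signals)
```

Under these assumptions the competence-based signal shrinks as a skill becomes reliable, so a selector driven by it shifts practice toward skills that are not yet mastered. A purely knowledge-based signal has no such tie to the agent's competence.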

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fa3/3824099/6437b45ff55d/fnbot-07-00022-g0001.jpg
