Curiosity driven reinforcement learning for motion planning on humanoids.

Affiliations

Dalle Molle Institute for Artificial Intelligence, Lugano, Switzerland; Facoltà di Scienze Informatiche, Università della Svizzera Italiana, Lugano, Switzerland; Dipartimento Tecnologie Innovative, Scuola Universitaria Professionale della Svizzera Italiana, Manno, Switzerland.

Publication information

Front Neurorobot. 2014 Jan 6;7:25. doi: 10.3389/fnbot.2013.00025.

DOI: 10.3389/fnbot.2013.00025
PMID: 24432001
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3881010/

Abstract

Most previous work on artificial curiosity (AC) and intrinsic motivation focuses on basic concepts and theory. Experimental results are generally limited to toy scenarios, such as navigation in a simulated maze, or control of a simple mechanical system with one or two degrees of freedom. To study AC in a more realistic setting, we embody a curious agent in the complex iCub humanoid robot. Our novel reinforcement learning (RL) framework consists of a state-of-the-art, low-level, reactive control layer, which controls the iCub while respecting constraints, and a high-level curious agent, which explores the iCub's state-action space through information gain maximization, learning a world model from experience, controlling the actual iCub hardware in real-time. To the best of our knowledge, this is the first ever embodied, curious agent for real-time motion planning on a humanoid. We demonstrate that it can learn compact Markov models to represent large regions of the iCub's configuration space, and that the iCub explores intelligently, showing interest in its physical constraints as well as in objects it finds in its environment.
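The idea of exploring by information gain maximization while learning a Markov world model can be illustrated with a small tabular sketch. This is not the authors' implementation, and the abstract does not specify their model or update rule. Everything here is an assumption made for illustration: the class name `CuriousTabularAgent`, the Dirichlet pseudo-count model, the greedy one-step action choice, and the use of the KL divergence between the predicted next-state distribution after and before a hypothetical observation as the information-gain measure.

```python
import math
from collections import defaultdict


class CuriousTabularAgent:
    """Hypothetical curious agent over a small discrete state-action space.

    The world model is a table of Dirichlet pseudo-counts per (state, action)
    pair. This is an illustrative stand-in, not the model used in the paper.
    """

    def __init__(self, n_states, n_actions, prior=1.0):
        self.n_states = n_states
        self.n_actions = n_actions
        # counts[(s, a)][s_next] holds pseudo-counts of observed transitions.
        self.counts = defaultdict(lambda: [prior] * n_states)

    def predict(self, s, a):
        """Predicted next-state distribution for taking action a in state s."""
        c = self.counts[(s, a)]
        total = sum(c)
        return [x / total for x in c]

    @staticmethod
    def kl(p, q):
        """KL divergence D(p || q) for discrete distributions."""
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    def expected_info_gain(self, s, a):
        """Expected change in the prediction after one more observation.

        Each possible outcome is weighted by its current predicted
        probability, and the gain is the KL divergence between the
        updated and the current prediction.
        """
        c = self.counts[(s, a)]
        total = sum(c)
        before = [x / total for x in c]
        gain = 0.0
        for s_next, p in enumerate(before):
            after = [
                (x + (1.0 if i == s_next else 0.0)) / (total + 1.0)
                for i, x in enumerate(c)
            ]
            gain += p * self.kl(after, before)
        return gain

    def choose_action(self, s):
        """Greedy one-step choice: the action expected to teach the most."""
        gains = [self.expected_info_gain(s, a) for a in range(self.n_actions)]
        return max(range(self.n_actions), key=gains.__getitem__)

    def update(self, s, a, s_next):
        """Record an observed transition in the world model."""
        self.counts[(s, a)][s_next] += 1.0
```

In this sketch, a state-action pair that has been tried many times yields a small expected gain, so the agent drifts toward transitions it has not yet modeled well. A real system would also need a planner that looks more than one step ahead and a low-level controller that enforces the robot's constraints, neither of which is shown here.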

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/f68d4afc205c/fnbot-07-00025-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/20e2e2ad5a9b/fnbot-07-00025-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/b54a9ca83fc4/fnbot-07-00025-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/a9e5848b2525/fnbot-07-00025-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/44e4d74eacc2/fnbot-07-00025-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/10bd48f975b2/fnbot-07-00025-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/756988320d71/fnbot-07-00025-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/d085838ed7ae/fnbot-07-00025-g0008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/164fec003953/fnbot-07-00025-g0009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2b3c/3881010/20780f83eaa5/fnbot-07-00025-g0010.jpg

Similar articles

1
Curiosity driven reinforcement learning for motion planning on humanoids.
Front Neurorobot. 2014 Jan 6;7:25. doi: 10.3389/fnbot.2013.00025.
2
An On-chip Spiking Neural Network for Estimation of the Head Pose of the iCub Robot.
Front Neurosci. 2020 Jun 23;14:551. doi: 10.3389/fnins.2020.00551. eCollection 2020.
3
A Socially Adaptable Framework for Human-Robot Interaction.
Front Robot AI. 2020 Oct 19;7:121. doi: 10.3389/frobt.2020.00121. eCollection 2020.
4
iCub-HRI: A Software Framework for Complex Human-Robot Interaction Scenarios on the iCub Humanoid Robot.
Front Robot AI. 2018 Mar 12;5:22. doi: 10.3389/frobt.2018.00022. eCollection 2018.
5
Grounding Action Words in the Sensorimotor Interaction with the World: Experiments with a Simulated iCub Humanoid Robot.
Front Neurorobot. 2010 May 31;4. doi: 10.3389/fnbot.2010.00007. eCollection 2010.
6
Task Feasibility Maximization Using Model-Free Policy Search and Model-Based Whole-Body Control.
Front Robot AI. 2020 Jun 4;7:61. doi: 10.3389/frobt.2020.00061. eCollection 2020.
7
Confidence-based progress-driven self-generated goals for skill acquisition in developmental robots.
Front Psychol. 2013 Nov 26;4:833. doi: 10.3389/fpsyg.2013.00833. eCollection 2013.
8
Optimization-Based Controllers for Robotics Applications (OCRA): The Case of iCub's Whole-Body Control.
Front Robot AI. 2018 Mar 29;5:24. doi: 10.3389/frobt.2018.00024. eCollection 2018.
9
RL-DOVS: Reinforcement Learning for Autonomous Robot Navigation in Dynamic Environments.
Sensors (Basel). 2022 May 19;22(10):3847. doi: 10.3390/s22103847.
10
Markerless Eye-Hand Kinematic Calibration on the iCub Humanoid Robot.
Front Robot AI. 2018 Jun 12;5:46. doi: 10.3389/frobt.2018.00046. eCollection 2018.

Cited by

1
The promise and peril of interactive embodied agents for studying non-verbal communication: a machine learning perspective.
Philos Trans R Soc Lond B Biol Sci. 2023 Apr 24;378(1875):20210475. doi: 10.1098/rstb.2021.0475. Epub 2023 Mar 6.
2
Intrinsic motivation learning for real robot applications.
Front Robot AI. 2023 Feb 10;10:1102438. doi: 10.3389/frobt.2023.1102438. eCollection 2023.
3
Adaptive Discount Factor for Deep Reinforcement Learning in Continuing Tasks with Uncertainty.
Sensors (Basel). 2022 Sep 25;22(19):7266. doi: 10.3390/s22197266.
4
Grounding Context in Embodied Cognitive Robotics.
Front Neurorobot. 2022 Jun 15;16:843108. doi: 10.3389/fnbot.2022.843108. eCollection 2022.
5
A Survey on AI-Driven Digital Twins in Industry 4.0: Smart Manufacturing and Advanced Robotics.
Sensors (Basel). 2021 Sep 23;21(19):6340. doi: 10.3390/s21196340.
6
Toward Computational Motivation for Multi-Agent Systems and Swarms.
Front Robot AI. 2018 Dec 18;5:134. doi: 10.3389/frobt.2018.00134. eCollection 2018.
7
Curiosity-based learning in infants: a neurocomputational approach.
Dev Sci. 2018 Jul;21(4):e12629. doi: 10.1111/desc.12629. Epub 2017 Oct 26.
8
Closed-loop Robots Driven by Short-Term Synaptic Plasticity: Emergent Explorative vs. Limit-Cycle Locomotion.
Front Neurorobot. 2016 Oct 18;10:12. doi: 10.3389/fnbot.2016.00012. eCollection 2016.
9
Novel plasticity rule can explain the development of sensorimotor intelligence.
Proc Natl Acad Sci U S A. 2015 Nov 10;112(45):E6224-32. doi: 10.1073/pnas.1508400112. Epub 2015 Oct 26.
10
A neural network-based exploratory learning and motor planning system for co-robots.
Front Neurorobot. 2015 Jul 23;9:7. doi: 10.3389/fnbot.2015.00007. eCollection 2015.

References

1
PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem.
Front Psychol. 2013 Jun 7;4:313. doi: 10.3389/fpsyg.2013.00313. eCollection 2013.
2
First experiments with POWERPLAY.
Neural Netw. 2013 May;41:130-6. doi: 10.1016/j.neunet.2013.01.022. Epub 2013 Feb 10.
3
Learning tactile skills through curious exploration.
Front Neurorobot. 2012 Jul 23;6:6. doi: 10.3389/fnbot.2012.00006. eCollection 2012.
4
Artificial intelligence. Autonomous mental development by robots and animals.
Science. 2001 Jan 26;291(5504):599-600. doi: 10.1126/science.291.5504.599.