Leveraging Imitation Learning on Pose Regulation Problem of a Robotic Fish.

Authors

Zhang Tianhao, Yue Lu, Wang Chen, Sun Jinan, Zhang Shikun, Wei Airong, Xie Guangming

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4232-4245. doi: 10.1109/TNNLS.2022.3202075. Epub 2024 Feb 29.

DOI: 10.1109/TNNLS.2022.3202075
PMID: 36070266

Abstract

In this article, the pose regulation control problem of a robotic fish is investigated by formulating it as a Markov decision process (MDP). This typical task, which requires the robot to arrive at the desired position with the desired orientation, remains a challenge because the two objectives (position and orientation) may conflict during optimization. To handle this, we adopt a sparse reward scheme, i.e., the robot is rewarded if and only if it completes the pose regulation task. Although deep reinforcement learning (DRL) can solve such an MDP with sparse rewards, the absence of immediate reward hinders the robot from learning efficiently. To this end, we propose a novel imitation learning (IL) method that learns DRL-based policies from demonstrations with inverse reward shaping to overcome the challenge raised by extremely sparse rewards. Moreover, we design a demonstrator that generates various trajectory demonstrations from one simple example provided by a nonexpert helper, which greatly reduces the time spent collecting robot samples. The simulation results demonstrate the effectiveness of our proposed demonstrator and the state-of-the-art (SOTA) performance of our proposed IL method. Furthermore, we deploy the trained IL policy on a physical robotic fish to perform pose regulation in a swimming tank with and without external disturbances. The experimental results verify the effectiveness and robustness of our proposed methods in the real world. Therefore, we believe this article is a step forward in the field of biomimetic underwater robot learning.
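
The sparse reward scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the planar pose representation `(x, y, yaw)`, the function names, and the tolerance values `pos_tol` and `yaw_tol` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def sparse_pose_reward(pose, goal, pos_tol=0.05, yaw_tol=math.radians(10.0)):
    """Return 1.0 only when both position and orientation are within tolerance.

    pose, goal: hypothetical (x, y, yaw) tuples, with yaw in radians.
    pos_tol, yaw_tol: hypothetical success thresholds (not from the paper).
    """
    x, y, yaw = pose
    gx, gy, gyaw = goal
    pos_err = math.hypot(gx - x, gy - y)       # Euclidean distance to the goal
    yaw_err = abs(wrap_angle(gyaw - yaw))      # smallest heading difference
    # No partial credit: reaching the position with the wrong heading earns 0.
    return 1.0 if (pos_err <= pos_tol and yaw_err <= yaw_tol) else 0.0
```

Because the reward is zero everywhere except at task completion, position error and orientation error are never traded off against each other inside the reward, which is the motivation the abstract gives; the cost is that the agent receives no intermediate learning signal.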


Similar articles

1. Leveraging Imitation Learning on Pose Regulation Problem of a Robotic Fish.
   IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4232-4245. doi: 10.1109/TNNLS.2022.3202075. Epub 2024 Feb 29.
2. Task-Oriented Deep Reinforcement Learning for Robotic Skill Acquisition and Control.
   IEEE Trans Cybern. 2021 Feb;51(2):1056-1069. doi: 10.1109/TCYB.2019.2949596. Epub 2021 Jan 15.
3. Optimizing Robotic Task Sequencing and Trajectory Planning on the Basis of Deep Reinforcement Learning.
   Biomimetics (Basel). 2023 Dec 27;9(1):10. doi: 10.3390/biomimetics9010010.
4. A reinforcement learning algorithm acquires demonstration from the training agent by dividing the task space.
   Neural Netw. 2023 Jul;164:419-427. doi: 10.1016/j.neunet.2023.04.042. Epub 2023 May 5.
5. Learning of Long-Horizon Sparse-Reward Robotic Manipulator Tasks With Base Controllers.
   IEEE Trans Neural Netw Learn Syst. 2024 Mar;35(3):4072-4081. doi: 10.1109/TNNLS.2022.3201705. Epub 2024 Feb 29.
6. A Multitasking-Oriented Robot Arm Motion Planning Scheme Based on Deep Reinforcement Learning and Twin Synchro-Control.
   Sensors (Basel). 2020 Jun 21;20(12):3515. doi: 10.3390/s20123515.
7. Human-robot skills transfer interfaces for a flexible surgical robot.
   Comput Methods Programs Biomed. 2014 Sep;116(2):81-96. doi: 10.1016/j.cmpb.2013.12.015. Epub 2014 Jan 8.
8. Deep imitation learning for 3D navigation tasks.
   Neural Comput Appl. 2018;29(7):389-404. doi: 10.1007/s00521-017-3241-z. Epub 2017 Dec 4.
9. Multi-Objective Optimal Trajectory Planning for Robotic Arms Using Deep Reinforcement Learning.
   Sensors (Basel). 2023 Jun 27;23(13):5974. doi: 10.3390/s23135974.
10. Koopman Operator-Based Knowledge-Guided Reinforcement Learning for Safe Human-Robot Interaction.
    Front Robot AI. 2022 Jun 16;9:779194. doi: 10.3389/frobt.2022.779194. eCollection 2022.

Cited by

1. A Survey on Reinforcement Learning Methods in Bionic Underwater Robots.
   Biomimetics (Basel). 2023 Apr 20;8(2):168. doi: 10.3390/biomimetics8020168.