Cong Lin, Liang Hongzhuo, Ruppel Philipp, Shi Yunlei, Görner Michael, Hendrich Norman, Zhang Jianwei
TAMS Group, Department of Informatics, Universität Hamburg, Hamburg, Germany.
Front Neurorobot. 2022 Mar 2;16:829437. doi: 10.3389/fnbot.2022.829437. eCollection 2022.
We propose a vision-proprioception model for planar object pushing that efficiently integrates all necessary information from the environment. A Variational Autoencoder (VAE) is used to extract compact representations from the task-relevant part of the image. With the real-time robot state obtained easily from the hardware system, we fuse the latent representations from the VAE and the robot end-effector position together as the state of a Markov Decision Process. We use Soft Actor-Critic (SAC) to train the robot to push different objects from random initial poses to target positions in simulation. Hindsight Experience Replay (HER) is applied during the training process to improve sample efficiency. Experiments demonstrate that our algorithm achieves pushing performance superior to a state-based baseline model, which cannot generalize to different objects, and outperforms state-of-the-art policies that operate on raw image observations. Finally, we verify that our trained model generalizes well to unseen objects in the real world.
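The core idea described above is to form the MDP state by concatenating the VAE's compact image latent with the proprioceptive end-effector position. The following is a minimal PyTorch sketch of that fusion step only; the encoder architecture, latent dimension (32), image size (64x64 crop), and 2-D planar end-effector position are illustrative assumptions not specified in the abstract.

```python
import torch
import torch.nn as nn


class VisionProprioceptionState(nn.Module):
    """Sketch: fuse a VAE image latent with the end-effector position.

    Assumptions for illustration: 32-D latent, 64x64 single-channel crop,
    and a 2-D planar end-effector position.
    """

    def __init__(self, latent_dim: int = 32, ee_dim: int = 2):
        super().__init__()
        # Toy encoder producing mean and log-variance of the latent distribution.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 64, 256), nn.ReLU(),
            nn.Linear(256, 2 * latent_dim),
        )
        self.latent_dim = latent_dim
        self.ee_dim = ee_dim

    def forward(self, image: torch.Tensor, ee_pos: torch.Tensor) -> torch.Tensor:
        # Encode the task-relevant image crop into a compact latent vector
        # (the mean of the approximate posterior is used as the representation).
        mu, _logvar = self.encoder(image).chunk(2, dim=-1)
        # Concatenate latent features with the proprioceptive end-effector
        # position to form the state fed to the RL policy.
        return torch.cat([mu, ee_pos], dim=-1)


if __name__ == "__main__":
    model = VisionProprioceptionState()
    img = torch.rand(1, 1, 64, 64)       # assumed 64x64 grayscale crop
    ee = torch.tensor([[0.1, -0.2]])     # assumed planar end-effector position
    state = model(img, ee)
    print(state.shape)                   # torch.Size([1, 34])
```

In the actual pipeline the VAE would be pre-trained on task-relevant image crops and the fused state passed to an SAC agent trained with HER-relabeled goals; those components are omitted here for brevity.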