Sim-to-real latent prediction: Transferring visual non-prehensile manipulation policies.

Authors

Rizzardo Carlo, Chen Fei, Caldwell Darwin

Affiliations

Active Perception and Robot Interactive Learning Laboratory, Advanced Robotics, Istituto Italiano di Tecnologia, Genova, Italy.

Department of Mechanical and Automation Engineering, T-Stone Robotics Institute, The Chinese University of Hong Kong, Hong Kong, China.

Publication information

Front Robot AI. 2023 Jan 12;9:1067502. doi: 10.3389/frobt.2022.1067502. eCollection 2022.

Abstract

Reinforcement Learning has shown great potential for robotics. It has demonstrated the capability to solve complex manipulation and locomotion tasks, even by learning end-to-end policies that operate directly on visual input, removing the need for custom perception systems. However, for practical robotics applications, its poor sample efficiency and its need for huge amounts of resources, data, and computation time can be an insurmountable obstacle. One potential solution to this sample efficiency issue is the use of simulated environments. However, the discrepancy in visual and physical characteristics between reality and simulation, namely the sim-to-real gap, often significantly reduces the real-world performance of policies trained within a simulator. In this work we propose a sim-to-real technique that trains a Soft Actor-Critic agent together with a decoupled feature extractor and a latent-space dynamics model. The decoupled nature of the method allows the sim-to-real transfer of the feature extractor and of the control policy to be performed independently, and the dynamics model acts as a constraint on the latent representation when finetuning the feature extractor on real-world data. We show how this architecture allows a trained agent to be transferred from simulation to reality without retraining or finetuning the control policy, using real-world data only to adapt the feature extractor. By avoiding training the control policy in the real domain, we remove the need to apply Reinforcement Learning to real-world data. Instead, we focus only on the unsupervised training of the feature extractor, considerably reducing real-world experience collection requirements. We evaluate the method on sim-to-sim and sim-to-real transfer of a policy for table-top robotic object pushing. We demonstrate that the method can adapt to considerable variations in the task observations, such as changes in point of view, colors, and lighting, while substantially reducing training time with respect to policies trained directly in the real world.
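The transfer idea in the abstract (freeze the control policy and the latent dynamics model, and adapt only the feature extractor on real-world transitions) can be illustrated with a toy sketch. Everything below is hypothetical and is not the authors' implementation: the state is a scalar, the "camera" is an affine map, the encoder is linear, and the latent-consistency loss is an assumed stand-in for whatever objective the paper actually uses, which the abstract does not specify.

```python
# Toy 1-D illustration. All names, dynamics and the loss are assumptions made
# for this sketch; the paper's real models are neural networks on images.
GOAL = 0.4

def step(s, a):
    """True physics, assumed identical in simulation and reality."""
    return 0.5 * s + a

def render_sim(s):
    """Simulated observation: the state itself."""
    return s

def render_real(s):
    """Real observation: same state, different 'appearance' (affine shift)."""
    return 2.0 * s + 1.0

def latent_dynamics(z, a):
    """Latent dynamics model learned in simulation, kept frozen afterwards."""
    return 0.5 * z + a

def policy(z):
    """Control policy learned in simulation, kept frozen afterwards."""
    return max(-1.0, min(1.0, GOAL - 0.5 * z))

def encode(params, o):
    """Linear feature extractor z = w * o + b."""
    w, b = params
    return w * o + b

def collect_real_transitions():
    """Unlabelled (observation, action, next observation) tuples from 'reality'."""
    grid = (-1.0, -0.5, 0.0, 0.5, 1.0)
    return [(render_real(s), a, render_real(step(s, a))) for s in grid for a in grid]

def finetune_encoder(params, data, lr=0.2, steps=500):
    """Gradient descent on the encoder only, so that encoded real transitions
    agree with the frozen latent dynamics (no rewards, no policy updates)."""
    w, b = params
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for o, a, o_next in data:
            err = (w * o_next + b) - latent_dynamics(w * o + b, a)
            # Derivatives of err; the 0.5 factors come from latent_dynamics.
            grad_w += 2.0 * err * (o_next - 0.5 * o)
            grad_b += 2.0 * err * (1.0 - 0.5)
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)

def rollout_real(params, s=0.0, horizon=20):
    """Run the frozen policy in 'reality' through the given encoder."""
    for _ in range(horizon):
        s = step(s, policy(encode(params, render_real(s))))
    return s

sim_params = (1.0, 0.0)  # encoder that is correct for render_sim
real_params = finetune_encoder(sim_params, collect_real_transitions())

print("adapted encoder (w, b):", real_params)
print("final state, sim encoder on real obs:", rollout_real(sim_params))
print("final state, adapted encoder on real obs:", rollout_real(real_params))
```

In this toy setting the frozen dynamics model is what pins down the latent space: the only encoder consistent with it on real transitions is the one that maps real observations back to the latent the policy was trained on, so the unchanged policy reaches the goal again. With real image encoders such a loss would not be sufficient on its own, which is one reason this should be read as an intuition aid rather than a recipe.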

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a99/9879568/8ad367bcd846/frobt-09-1067502-g001.jpg
