Precise and dexterous robotic manipulation via human-in-the-loop reinforcement learning.

Authors

Luo Jianlan, Xu Charles, Wu Jeffrey, Levine Sergey

Affiliation

Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720, USA.

Publication

Sci Robot. 2025 Aug 20;10(105):eads5033. doi: 10.1126/scirobotics.ads5033.

DOI:10.1126/scirobotics.ads5033

PMID:40834062

Abstract

Robotic manipulation remains one of the most difficult challenges in robotics, with approaches ranging from classical model-based control to modern imitation learning. Although these methods have enabled substantial progress, they often require extensive manual design, struggle with performance, and demand large-scale data collection. These limitations hinder their real-world deployment at scale, where reliability, speed, and robustness are essential. Reinforcement learning (RL) offers a powerful alternative by enabling robots to autonomously acquire complex manipulation skills through interaction. However, realizing the full potential of RL in the real world remains challenging because of issues of sample efficiency and safety. We present a human-in-the-loop, vision-based RL system that achieved strong performance on a wide range of dexterous manipulation tasks, including precise assembly, dynamic manipulation, and dual-arm coordination. These tasks reflect realistic industrial tolerances, with small but critical variations in initial object placements that demand sophisticated reactive control. Our method integrates demonstrations, human corrections, sample-efficient RL algorithms, and system-level design to directly learn RL policies in the real world. Within 1 to 2.5 hours of real-world training, our approach outperformed other baselines by improving task success by 2×, achieving near-perfect success rates, and executing 1.8× faster on average. Through extensive experiments and analysis, our results suggest that RL can learn a wide range of complex vision-based manipulation policies directly in the real world within practical training times. We hope that this work will inspire a new generation of learned robotic manipulation techniques, benefiting both industrial applications and research advancements.
