Cong Lin, Liang Hongzhuo, Ruppel Philipp, Shi Yunlei, Görner Michael, Hendrich Norman, Zhang Jianwei
TAMS Group, Department of Informatics, Universität Hamburg, Hamburg, Germany.
Front Neurorobot. 2022 Mar 2;16:829437. doi: 10.3389/fnbot.2022.829437. eCollection 2022.
We propose a vision-proprioception model for planar object pushing that efficiently integrates all necessary information from the environment. A Variational Autoencoder (VAE) is used to extract compact representations from the task-relevant part of the image. With the real-time robot state obtained easily from the hardware system, we fuse the latent representations from the VAE and the robot end-effector position together as the state of a Markov Decision Process. We use Soft Actor-Critic (SAC) to train the robot to push different objects from random initial poses to target positions in simulation. Hindsight Experience Replay (HER) is applied during the training process to improve sample efficiency. Experiments demonstrate that our algorithm achieves pushing performance superior to a state-based baseline model, which cannot generalize to different objects, and outperforms state-of-the-art policies that operate on raw image observations. Finally, we verify that our trained model generalizes well to unseen objects in the real world.
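The core idea described above is to form the MDP state by concatenating the VAE's compact image latent with the proprioceptive end-effector position. The following is a minimal PyTorch sketch of that fusion step only; the encoder architecture, latent dimension (32), image size (64x64 crop), and 2-D planar end-effector position are illustrative assumptions not specified in the abstract.

```python
import torch
import torch.nn as nn


class VisionProprioceptionState(nn.Module):
    """Sketch: fuse a VAE image latent with the end-effector position.

    Assumptions for illustration: 32-D latent, 64x64 single-channel crop,
    and a 2-D planar end-effector position.
    """

    def __init__(self, latent_dim: int = 32, ee_dim: int = 2):
        super().__init__()
        # Toy encoder producing mean and log-variance of the latent distribution.
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 64, 256), nn.ReLU(),
            nn.Linear(256, 2 * latent_dim),
        )
        self.latent_dim = latent_dim
        self.ee_dim = ee_dim

    def forward(self, image: torch.Tensor, ee_pos: torch.Tensor) -> torch.Tensor:
        # Encode the task-relevant image crop into a compact latent vector
        # (the mean of the approximate posterior is used as the representation).
        mu, _logvar = self.encoder(image).chunk(2, dim=-1)
        # Concatenate latent features with the proprioceptive end-effector
        # position to form the state fed to the RL policy.
        return torch.cat([mu, ee_pos], dim=-1)


if __name__ == "__main__":
    model = VisionProprioceptionState()
    img = torch.rand(1, 1, 64, 64)       # assumed 64x64 grayscale crop
    ee = torch.tensor([[0.1, -0.2]])     # assumed planar end-effector position
    state = model(img, ee)
    print(state.shape)                   # torch.Size([1, 34])
```

In the actual pipeline the VAE would be pre-trained on task-relevant image crops and the fused state passed to an SAC agent trained with HER-relabeled goals; those components are omitted here for brevity.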