Reinforcement Learning with Task Decomposition and Task-Specific Reward System for Automation of High-Level Tasks.

Authors

Kwon Gunam, Kim Byeongjun, Kwon Nam Kyu

Affiliation

Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.

Publication

Biomimetics (Basel). 2024 Mar 26;9(4):196. doi: 10.3390/biomimetics9040196.

DOI:10.3390/biomimetics9040196

PMID:38667207

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11047822/

Abstract

This paper introduces a reinforcement learning method that uses task decomposition and a task-specific reward system to solve complex high-level tasks such as door opening, block stacking, and nut assembly. Each task is decomposed into subtasks: the grasping and putting subtasks are executed through single joint and gripper actions, while the remaining subtasks are trained with the soft actor-critic (SAC) algorithm and the task-specific reward system. The task-specific reward system aims to increase the learning speed, enhance the success rate, and enable more efficient task execution. The experimental results demonstrate the efficacy of the proposed method, which achieves success rates of 99.9% for door opening, 95.25% for block stacking, 80.8% for square-nut assembly, and 90.9% for round-nut assembly. Overall, the method offers a promising way to handle complex tasks and improves on the traditional end-to-end approach.
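The decomposition described in the abstract can be pictured with a minimal Python sketch. Everything here is an illustrative assumption rather than the authors' implementation: the `Subtask` structure, the subtask names for block stacking, the distance-based reward, and the thresholds are all hypothetical. The sketch only shows the split the abstract states, where grasping and putting are scripted and the other subtasks are left to a learned policy (such as SAC) with their own reward.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A state is modeled as a plain dict of named features (an assumption).
State = Dict[str, float]


@dataclass
class Subtask:
    name: str
    scripted: bool                    # True: fixed action; False: learned policy (e.g. SAC)
    reward: Callable[[State], float]  # task-specific reward (hypothetical form)
    done: Callable[[State], bool]     # completion check for this subtask


def reach_reward(state: State) -> float:
    # Hypothetical shaping term: negative gripper-to-target distance.
    return -state["distance"]


def zero_reward(state: State) -> float:
    # Scripted subtasks are not trained, so they need no learning signal.
    return 0.0


def build_block_stacking() -> List[Subtask]:
    # Hypothetical ordering of subtasks for a block-stacking task.
    return [
        Subtask("reach", False, reach_reward, lambda s: s["distance"] < 0.01),
        Subtask("grasp", True, zero_reward, lambda s: s["gripper_closed"] > 0.5),
        Subtask("move", False, reach_reward, lambda s: s["distance"] < 0.01),
        Subtask("put", True, zero_reward, lambda s: s["gripper_closed"] < 0.5),
    ]


def learned_subtasks(plan: List[Subtask]) -> List[str]:
    # Names of the subtasks that would be handed to the RL algorithm.
    return [task.name for task in plan if not task.scripted]
```

Under these assumptions, `learned_subtasks(build_block_stacking())` lists only the subtasks that need training, which is one way to see why decomposition shrinks the learning problem compared with an end-to-end policy.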
