A Hybrid MPC for Constrained Deep Reinforcement Learning applied for Planar Robotic Arm.

Author information

Al-Gabalawy Mostafa

Affiliation

Digital Media Department, Faculty of Computers and Information Technology, Future University, Egypt.

Publication information

ISA Trans. 2021 Apr 1. doi: 10.1016/j.isatra.2021.03.046.

Abstract

Recently, deep reinforcement learning techniques have achieved tangible results in learning high-dimensional control tasks. Because the autonomous agent interacts with the environment by trial and error, the learning phase is unconstrained and limited to the simulator. Such exploration has the additional drawback of consuming unnecessary samples at the beginning of the learning process. Model-based algorithms, on the other hand, handle this issue by learning the dynamics of the environment. However, model-free algorithms achieve higher asymptotic performance than model-based ones. The main contribution of this paper is a hybrid algorithm, MPC-DRL, that combines model predictive control (MPC) with deep reinforcement learning (DRL) and uses the benefits of both methods to satisfy constraint conditions throughout the learning process. The validity of the proposed approach is demonstrated by learning a reachability task. The results show complete satisfaction of the constraint condition, represented by a static obstacle, with fewer samples and higher performance than state-of-the-art model-free algorithms.
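
The abstract does not say how MPC and DRL are coupled, so the following is only a minimal sketch of one common pattern, not the paper's algorithm: a short-horizon, sampling-based MPC acts as a safety filter over the actions a learning policy proposes. Everything in it is an assumption made for illustration: the two-link kinematic arm model, the link lengths, the obstacle position and radius, the constraint being checked at the end effector only, and the random stand-in for a DRL policy.

```python
import math
import random

LINK_LENGTHS = (1.0, 1.0)    # hypothetical link lengths
OBSTACLE = (1.2, 1.2, 0.3)   # hypothetical static obstacle: x, y, radius
DT = 0.05                    # integration step of the assumed model
MAX_VEL = 1.0                # assumed joint velocity limit


def forward_kinematics(q):
    """Return the end-effector position of a planar two-link arm."""
    l1, l2 = LINK_LENGTHS
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return x, y


def step(q, u):
    """Kinematic model: joint velocities u integrate directly into angles."""
    return (q[0] + DT * u[0], q[1] + DT * u[1])


def is_safe(q):
    """Constraint (simplified): the end effector stays outside the obstacle."""
    x, y = forward_kinematics(q)
    ox, oy, radius = OBSTACLE
    return math.hypot(x - ox, y - oy) > radius


def rollout_cost(q, u, goal, horizon):
    """Hold u over the horizon; return None if any predicted state is unsafe."""
    for _ in range(horizon):
        q = step(q, u)
        if not is_safe(q):
            return None
    x, y = forward_kinematics(q)
    return math.hypot(x - goal[0], y - goal[1])


def mpc_filter(q, u_rl, goal, horizon=10, samples=64, rng=random):
    """Keep the policy's action if its rollout is safe, else pick a safe one."""
    if rollout_cost(q, u_rl, goal, horizon) is not None:
        return u_rl
    best_u, best_cost = (0.0, 0.0), float("inf")  # fallback: stand still
    for _ in range(samples):
        u = (rng.uniform(-MAX_VEL, MAX_VEL), rng.uniform(-MAX_VEL, MAX_VEL))
        cost = rollout_cost(q, u, goal, horizon)
        if cost is not None and cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


def random_policy(rng):
    """Placeholder for a DRL policy during early exploration."""
    return (rng.uniform(-MAX_VEL, MAX_VEL), rng.uniform(-MAX_VEL, MAX_VEL))


def run_episode(steps=200, seed=0):
    """Explore with the filtered policy and return every visited joint state."""
    rng = random.Random(seed)
    q, goal = (0.0, 0.0), (0.0, 1.5)
    visited = [q]
    for _ in range(steps):
        u = mpc_filter(q, random_policy(rng), goal, rng=rng)
        q = step(q, u)
        visited.append(q)
    return visited
```

Calling `run_episode()` returns the visited joint configurations. In this toy kinematic model, standing still never changes the state, so an episode that starts in a safe configuration stays safe at every step, which is the property the abstract describes as satisfying the constraint throughout learning. A real arm with dynamics would need a proper backup controller in place of the zero-velocity fallback.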
