Mahmud S M Nahid, Nivison Scott A, Bell Zachary I, Kamalapurkar Rushikesh
School of Mechanical and Aerospace Engineering, Oklahoma State University, Stillwater, OK, United States.
Munitions Directorate, Air Force Research Laboratory, Eglin AFB, FL, United States.
Front Robot AI. 2021 Dec 16;8:733104. doi: 10.3389/frobt.2021.733104. eCollection 2021.
Reinforcement learning has been established over the past decade as an effective tool for finding optimal control policies for dynamical systems, with recent focus on approaches that guarantee safety during the learning and/or execution phases. In general, safety guarantees are critical in reinforcement learning when the system is safety-critical and/or task restarts are not practically feasible. In optimal control theory, safety requirements are often expressed in terms of state and/or control constraints. In recent years, reinforcement learning approaches that rely on persistent excitation have been combined with a barrier transformation to learn optimal control policies under state constraints. To relax the excitation requirements, model-based reinforcement learning methods that rely on exact model knowledge have also been integrated with the barrier transformation framework. The objective of this paper is to develop a safe reinforcement learning method for deterministic nonlinear systems with parametric uncertainties in the model, so that approximate constrained optimal policies can be learned without relying on stringent excitation conditions. To that end, a model-based reinforcement learning technique that utilizes a novel filtered concurrent learning method, along with a barrier transformation, is developed in this paper to realize simultaneous learning of unknown model parameters and approximate optimal state-constrained control policies for safety-critical systems.
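To illustrate the idea behind the barrier transformation mentioned in the abstract, the following is a minimal sketch of a log-type barrier mapping commonly used in this line of work. It maps a state constrained to an interval (a, A), with a < 0 < A, onto the whole real line, so that an unconstrained learning method applied to the transformed state automatically respects the original constraint. The specific functional form below is an illustrative assumption and may differ from the transformation used in the paper.

```python
import math

def barrier(x, a, A):
    """Map x in the open interval (a, A), with a < 0 < A, to s in (-inf, inf).

    As x -> a the output tends to -inf, as x -> A it tends to +inf,
    and barrier(0) = 0, so the origin is preserved.
    """
    return math.log(A * (a - x) / (a * (A - x)))

def barrier_inv(s, a, A):
    """Inverse map: recover the constrained state x from the transformed state s.

    Any finite s maps back into (a, A), so bounded trajectories of the
    transformed system satisfy the original state constraint.
    """
    return a * A * (math.exp(s) - 1.0) / (a * math.exp(s) - A)
```

In a barrier-transformation approach, the dynamics are rewritten in terms of the transformed state s = barrier(x), the (approximate) optimal policy is learned for the unconstrained s-system, and boundedness of s then guarantees that x remains inside (a, A) throughout learning.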