Yang Yongliang, Vamvoudakis Kyriakos G, Modares Hamidreza, Yin Yixin, Wunsch Donald C
IEEE Trans Neural Netw Learn Syst. 2020 Dec;31(12):5441-5455. doi: 10.1109/TNNLS.2020.2967871. Epub 2020 Nov 30.
In this article, we present an intermittent framework for safe reinforcement learning (RL) algorithms. First, we develop a barrier function-based system transformation that imposes the state constraints while converting the original constrained problem into an unconstrained optimization problem. Second, based on the derived optimal policies, two types of intermittent feedback RL algorithms are presented, namely, a static and a dynamic one. Finally, we leverage an actor/critic structure to solve the problem online while guaranteeing optimality, stability, and safety. Simulation results show the efficacy of the proposed approach.
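To make the first step concrete, the following is a minimal numerical sketch of a barrier function-based state transformation, assuming a logarithmic barrier of the form b(x; a, A) = log((A/a)(a - x)/(A - x)) with bounds a < 0 < A, one common choice for mapping a constrained interval onto the whole real line. The bounds, variable names, and test values are illustrative assumptions, not taken from the article.

import numpy as np

# Hedged sketch (assumed form, not necessarily the article's exact barrier):
# map a constrained state x in (a, A), with a < 0 < A, to an unconstrained
# transformed state s = b(x), so the optimization can be carried out over s
# without explicit state constraints.

def barrier(x, a, A):
    """Map x in (a, A) onto (-inf, inf); b(0) = 0 since a < 0 < A."""
    return np.log((A / a) * (a - x) / (A - x))

def barrier_inv(s, a, A):
    """Inverse map: recover x in (a, A) from the unconstrained s."""
    return a * A * (np.exp(s) - 1.0) / (a * np.exp(s) - A)

if __name__ == "__main__":
    a, A = -2.0, 3.0  # illustrative state bounds (assumed)
    for x in (-1.9, 0.0, 2.9):
        s = barrier(x, a, A)
        assert abs(barrier_inv(s, a, A) - x) < 1e-9  # round trip is exact
        print(f"x = {x:+.2f}  ->  s = {s:+.4f}")

As x approaches either bound, s diverges to +/- infinity, so any policy expressed in the transformed coordinate keeps the original state strictly inside (a, A) by construction.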