Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning.

Authors

Zhang Lingzhi, Xie Lei, Jiang Yi, Li Zhishan, Liu Xueqin, Su Hongye

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):854-865. doi: 10.1109/TNNLS.2023.3326397. Epub 2025 Jan 7.

DOI:10.1109/TNNLS.2023.3326397
PMID:37906491
Abstract

The state and input constraints of nonlinear systems could greatly impede the realization of their optimal control when using reinforcement learning (RL)-based approaches since the commonly used quadratic utility functions cannot meet the requirements of solving constrained optimization problems. This article develops a novel optimal control approach for constrained discrete-time (DT) nonlinear systems based on safe RL. Specifically, a barrier function (BF) is introduced and incorporated with the value function to help transform a constrained optimization problem into an unconstrained one. Meanwhile, the minimum of such an optimization problem can be guaranteed to occur at the origin. Then a constrained policy iteration (PI) algorithm is developed to realize the optimal control of the nonlinear system and to enable the state and input constraints to be satisfied. The constrained optimal control policy and its corresponding value function are derived through the implementation of two neural networks (NNs). Performance analysis shows that the proposed control approach still retains the convergence and optimality properties of the traditional PI algorithm. Simulation results of three examples reveal its effectiveness.
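The idea of folding a barrier function into the cost can be illustrated with a minimal sketch. The abstract does not give the paper's actual barrier function, so the symmetric log barrier below, the scalar state and input, and the names `barrier`, `stage_cost`, `x_max`, `u_max`, `q`, `r`, and `mu` are all assumptions made for illustration. The sketch only shows the two properties the abstract mentions: the augmented cost grows without bound as a constraint boundary is approached, and its minimum sits at the origin.

```python
import math


def barrier(z: float, limit: float) -> float:
    """Illustrative log barrier for the constraint |z| < limit (an assumption,
    not the paper's function). It is zero at z = 0 and grows without bound
    as |z| approaches limit."""
    if abs(z) >= limit:
        return math.inf  # outside the open constraint set
    return -math.log(1.0 - (z / limit) ** 2)


def stage_cost(x: float, u: float, x_max: float, u_max: float,
               q: float = 1.0, r: float = 1.0, mu: float = 1.0) -> float:
    """Quadratic utility plus barrier terms on the state and the input.
    The weights q, r, and mu are hypothetical tuning parameters."""
    quadratic = q * x * x + r * u * u
    return quadratic + mu * (barrier(x, x_max) + barrier(u, u_max))


if __name__ == "__main__":
    # The cost is zero at the origin and rises steeply near the state bound.
    for x in (0.0, 0.5, 0.9, 0.99):
        print(x, stage_cost(x, 0.0, x_max=1.0, u_max=1.0))
```

With a cost of this shape, an unconstrained minimizer is pushed away from the constraint boundaries, which is the role the abstract assigns to the barrier function before the policy iteration step.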

Similar articles

1
Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning.
IEEE Trans Neural Netw Learn Syst. 2025 Jan;36(1):854-865. doi: 10.1109/TNNLS.2023.3326397. Epub 2025 Jan 7.
2
Neural-network-based accelerated safe Q-learning for optimal control of discrete-time nonlinear systems with state constraints.
Neural Netw. 2025 Jun;186:107249. doi: 10.1016/j.neunet.2025.107249. Epub 2025 Feb 10.
3
NN-Based Reinforcement Learning Optimal Control for Inequality-Constrained Nonlinear Discrete-Time Systems With Disturbances.
IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15507-15516. doi: 10.1109/TNNLS.2023.3287881. Epub 2024 Oct 29.
4
Adaptive Constrained Optimal Control Design for Data-Based Nonlinear Discrete-Time Systems With Critic-Only Structure.
IEEE Trans Neural Netw Learn Syst. 2018 Jun;29(6):2099-2111. doi: 10.1109/TNNLS.2017.2751018. Epub 2017 Oct 3.
5
Action Mapping: A Reinforcement Learning Method for Constrained-Input Systems.
IEEE Trans Neural Netw Learn Syst. 2023 Oct;34(10):7145-7157. doi: 10.1109/TNNLS.2021.3138924. Epub 2023 Oct 5.
6
Reinforcement learning based adaptive optimal control for constrained nonlinear system via a novel state-dependent transformation.
ISA Trans. 2023 Feb;133:29-41. doi: 10.1016/j.isatra.2022.07.006. Epub 2022 Jul 12.
7
A policy iteration approach to online optimal control of continuous-time constrained-input systems.
ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.
8
Adaptive nearly optimal control for a class of continuous-time nonaffine nonlinear systems with inequality constraints.
ISA Trans. 2017 Jan;66:122-133. doi: 10.1016/j.isatra.2016.10.019. Epub 2016 Nov 9.
9
Adaptive Interleaved Reinforcement Learning: Robust Stability of Affine Nonlinear Systems With Unknown Uncertainty.
IEEE Trans Neural Netw Learn Syst. 2022 Jan;33(1):270-280. doi: 10.1109/TNNLS.2020.3027653. Epub 2022 Jan 5.
10
Reinforcement-Learning-Based Robust Controller Design for Continuous-Time Uncertain Nonlinear Systems Subject to Input Constraints.
IEEE Trans Cybern. 2015 Jul;45(7):1372-85. doi: 10.1109/TCYB.2015.2417170. Epub 2015 Apr 9.