Approximate Dynamic Programming for Nonlinear-Constrained Optimizations.

Publication information

IEEE Trans Cybern. 2021 May;51(5):2419-2432. doi: 10.1109/TCYB.2019.2926248. Epub 2021 Apr 15.

DOI:10.1109/TCYB.2019.2926248
PMID:31329149
Abstract

In this paper, we study the constrained optimization problem of a class of uncertain nonlinear interconnected systems. First, we prove that the solution of the constrained optimization problem can be obtained through solving an array of optimal control problems of constrained auxiliary subsystems. Then, under the framework of approximate dynamic programming, we present a simultaneous policy iteration (SPI) algorithm to solve the Hamilton-Jacobi-Bellman equations corresponding to the constrained auxiliary subsystems. By building an equivalence relationship, we demonstrate the convergence of the SPI algorithm. Meanwhile, we implement the SPI algorithm via an actor-critic structure, where actor networks are used to approximate optimal control policies and critic networks are applied to estimate optimal value functions. By using the least squares method and the Monte Carlo integration technique together, we are able to determine the weight vectors of actor and critic networks. Finally, we validate the developed control method through the simulation of a nonlinear interconnected plant.
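The least-squares and Monte Carlo step mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's SPI algorithm: it runs plain policy iteration on an assumed scalar, unconstrained linear system dx/dt = a*x + b*u with cost q*x^2 + r*u^2, and the policy is computed in closed form from the critic rather than by a separate actor network. The critic basis [x^2, x^4], the sampling domain [-1, 1], and all parameter values are illustrative assumptions. States are drawn at random, so the least-squares fit minimizes a Monte Carlo estimate of the integrated squared Bellman residual.

```python
import numpy as np

def basis_grad(x):
    """Gradient of the assumed critic basis [x^2, x^4] at each sampled state."""
    return np.stack([2.0 * x, 4.0 * x**3], axis=1)

def policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0,
                     n_samples=200, n_iters=10, seed=0):
    """Toy policy iteration for dx/dt = a*x + b*u (hypothetical example system).

    Returns the critic weights w such that V(x) ~ w[0]*x^2 + w[1]*x^4.
    """
    rng = np.random.default_rng(seed)
    # Monte Carlo samples of the state domain (assumed to be [-1, 1]).
    x = rng.uniform(-1.0, 1.0, size=n_samples)
    dphi = basis_grad(x)

    u = -k0 * x          # initial policy; must stabilize the system (a - b*k0 < 0)
    w = np.zeros(2)
    for _ in range(n_iters):
        f = a * x + b * u                 # closed-loop dynamics at the samples
        cost = q * x**2 + r * u**2        # running cost at the samples
        # Policy evaluation: dV/dx * f + cost = 0 at every sample,
        # solved for w in the least-squares sense.
        A = dphi * f[:, None]
        w, *_ = np.linalg.lstsq(A, -cost, rcond=None)
        # Policy improvement: u = -(b / (2r)) * dV/dx.
        u = -(b / (2.0 * r)) * (dphi @ w)
    return w

if __name__ == "__main__":
    w = policy_iteration()
    print("critic weights:", w)
```

For this linear-quadratic case the exact value function is quadratic, so the x^2 weight should approach the positive root of the scalar Riccati equation 2*a*w - (b^2/r)*w^2 + q = 0, and the x^4 weight should stay near zero.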

Similar articles

1
Approximate Dynamic Programming for Nonlinear-Constrained Optimizations.
IEEE Trans Cybern. 2021 May;51(5):2419-2432. doi: 10.1109/TCYB.2019.2926248. Epub 2021 Apr 15.
2
Adaptive critic designs for optimal control of uncertain nonlinear systems with unmatched interconnections.
Neural Netw. 2018 Sep;105:142-153. doi: 10.1016/j.neunet.2018.05.005. Epub 2018 May 26.
3
Decentralized Event-Driven Constrained Control Using Adaptive Critic Designs.
IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5830-5844. doi: 10.1109/TNNLS.2021.3071548. Epub 2022 Oct 5.
4
Reinforcement learning for robust stabilization of nonlinear systems with asymmetric saturating actuators.
Neural Netw. 2023 Jan;158:132-141. doi: 10.1016/j.neunet.2022.11.012. Epub 2022 Nov 16.
5
Policy-Iteration-Based Finite-Horizon Approximate Dynamic Programming for Continuous-Time Nonlinear Optimal Control.
IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5255-5267. doi: 10.1109/TNNLS.2022.3225090. Epub 2023 Sep 1.
6
Decentralized Neurocontroller Design With Critic Learning for Nonlinear-Interconnected Systems.
IEEE Trans Cybern. 2022 Nov;52(11):11672-11685. doi: 10.1109/TCYB.2021.3085883. Epub 2022 Oct 17.
7
Adaptive-Critic Design for Decentralized Event-Triggered Control of Constrained Nonlinear Interconnected Systems Within an Identifier-Critic Framework.
IEEE Trans Cybern. 2022 Aug;52(8):7478-7491. doi: 10.1109/TCYB.2020.3037321. Epub 2022 Jul 19.
8
A policy iteration approach to online optimal control of continuous-time constrained-input systems.
ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.
9
Decentralized Event-Triggered Control for a Class of Nonlinear-Interconnected Systems Using Reinforcement Learning.
IEEE Trans Cybern. 2021 Feb;51(2):635-648. doi: 10.1109/TCYB.2019.2946122. Epub 2021 Jan 15.
10
Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data.
IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):714-725. doi: 10.1109/TNNLS.2016.2561300. Epub 2016 May 27.

Cited by

1
Neural Adaptive Sliding-Mode Control for Uncertain Nonlinear Systems with Disturbances Using Adaptive Dynamic Programming.
Entropy (Basel). 2023 Nov 22;25(12):1570. doi: 10.3390/e25121570.