Policy-Iteration-Based Finite-Horizon Approximate Dynamic Programming for Continuous-Time Nonlinear Optimal Control.

Author information

Lin Ziyu, Duan Jingliang, Li Shengbo Eben, Ma Haitong, Li Jie, Chen Jianyu, Cheng Bo, Ma Jun

Publication information

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5255-5267. doi: 10.1109/TNNLS.2022.3225090. Epub 2023 Sep 1.

Abstract

The Hamilton-Jacobi-Bellman (HJB) equation is the necessary and sufficient condition for the optimal solution of a continuous-time (CT) optimal control problem (OCP). Compared with the infinite-horizon HJB equation, solving the finite-horizon (FH) HJB equation has been a long-standing challenge because the partial time derivative of the value function appears as an additional unknown term. To address this problem, this study establishes, for the first time, a link between the partial time derivative and the terminal-time utility function, which makes it possible to use the policy iteration (PI) technique to solve CT FH OCPs. Based on this key finding, an FH approximate dynamic programming (ADP) algorithm is proposed using an actor-critic framework. The algorithm is shown to have important properties in terms of convergence and optimality. More importantly, because multilayer neural networks (NNs) are used in the actor-critic architecture, the algorithm is suitable for CT FH OCPs involving more general nonlinear and complex systems. Finally, the effectiveness of the proposed algorithm is demonstrated through a series of simulations on a linear quadratic regulator (LQR) problem and a nonlinear vehicle tracking problem.
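
To make the "additional unknown term" concrete, the following is a minimal sketch for a scalar finite-horizon LQR problem. It is not the algorithm proposed in the paper. It assumes illustrative dynamics dx/dt = a*x + b*u and cost equal to the integral of q*x^2 + r*u^2 plus a terminal term s_T*x(T)^2; all names and numeric values are hypothetical. For this problem the value function is V(x, t) = p(t)*x^2, so the partial time derivative is p'(t)*x^2, which is generally nonzero on a finite horizon. The code integrates the Riccati differential equation for p(t) backward from the terminal condition with classical RK4 and evaluates the FH HJB residual.

```python
import math

def riccati_backward(a, b, q, r, s_T, T, steps=1000):
    """Integrate -dp/dt = 2*a*p - (b**2/r)*p**2 + q backward from p(T) = s_T.

    Returns a list of (t, p) pairs ordered from t = T down to t = 0.
    All parameters are illustrative assumptions, not values from the paper.
    """
    def dp_dtau(p):
        # tau = T - t is the time-to-go, so dp/dtau = -dp/dt.
        return 2.0 * a * p - (b * b / r) * p * p + q

    h = T / steps
    p = s_T
    out = [(T, p)]
    for k in range(1, steps + 1):
        # One classical fourth-order Runge-Kutta step in tau.
        k1 = dp_dtau(p)
        k2 = dp_dtau(p + 0.5 * h * k1)
        k3 = dp_dtau(p + 0.5 * h * k2)
        k4 = dp_dtau(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        out.append((T - k * h, p))
    return out

def hjb_residual(a, b, q, r, p, p_dot, x):
    """Left-hand side of the FH HJB equation for V(x, t) = p(t) * x**2."""
    u = -(b / r) * p * x          # minimizing control for this quadratic V
    dV_dt = p_dot * x * x         # the extra time-derivative term
    dV_dx = 2.0 * p * x
    return dV_dt + q * x * x + r * u * u + dV_dx * (a * x + b * u)

# Example with a = 0, b = q = r = 1, s_T = 0, where p(t) = tanh(T - t).
T = 2.0
traj = riccati_backward(a=0.0, b=1.0, q=1.0, r=1.0, s_T=0.0, T=T)
t0, p0 = traj[-1]
closed_form = math.tanh(T)
p_dot0 = p0 * p0 - 1.0            # dp/dt at t = 0 from the Riccati equation
residual = hjb_residual(0.0, 1.0, 1.0, 1.0, p0, p_dot0, x=0.7)
```

In this special case the numerically integrated p(0) can be compared against the closed form tanh(T), and the HJB residual should vanish up to rounding error. On an infinite horizon p would be constant and the time-derivative term would drop out, which is the contrast the abstract draws.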
