Neural Network-Based Solutions for Stochastic Optimal Control Using Path Integrals.

Publication information

IEEE Trans Neural Netw Learn Syst. 2017 Mar;28(3):534-545. doi: 10.1109/TNNLS.2016.2544787.

DOI:10.1109/TNNLS.2016.2544787

PMID:28212072

Abstract

In this paper, an offline approximate dynamic programming approach using neural networks is proposed for solving a class of finite horizon stochastic optimal control problems. There are two approaches available in the literature, one based on the stochastic maximum principle (SMP) formalism and the other based on solving the stochastic Hamilton-Jacobi-Bellman (HJB) equation. However, in the presence of noise, the SMP formalism becomes complex and results in having to solve a couple of backward stochastic differential equations. Hence, current solution methodologies typically ignore the noise effect. On the other hand, the inclusion of noise in the HJB framework is very straightforward. Furthermore, the stochastic HJB equation of a control-affine nonlinear stochastic system with a quadratic control cost function and an arbitrary state cost function can be formulated as a path integral (PI) problem. However, due to the curse of dimensionality, it might not be possible to utilize the PI formulation for obtaining comprehensive solutions over the entire operating domain. A neural network structure called the adaptive critic design paradigm is used to effectively handle this difficulty. In this paper, a novel adaptive critic approach using the PI formulation is proposed for solving stochastic optimal control problems. The potential of the algorithm is demonstrated through simulation results from a couple of benchmark problems.
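The PI formulation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's adaptive critic algorithm, only the standard path integral representation evaluated at a single state. It assumes a scalar system dx = f(x) dt + u dt + sigma dW with running cost q(x) + (1/2) R u^2 and terminal cost phi(x), for which the value function can be written as V = -lam * log E[exp(-S / lam)] with lam = R * sigma^2, where S is the state cost accumulated along uncontrolled (u = 0) sample paths. The function name `pi_value_estimate` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def pi_value_estimate(x0, t0, T, f, q, phi, sigma, R,
                      n_paths=100_000, n_steps=50, seed=0):
    """Monte Carlo estimate of V(x0, t0) for the scalar problem
    dx = f(x) dt + u dt + sigma dW, cost = phi(x_T) + int (q(x) + 0.5 R u^2) dt.

    Assumption: lam = R * sigma**2, the condition under which the
    log transform V = -lam * log(psi) linearizes the HJB equation.
    """
    rng = np.random.default_rng(seed)
    lam = R * sigma**2
    dt = (T - t0) / n_steps

    x = np.full(n_paths, float(x0))   # all paths start at x0
    S = np.zeros(n_paths)             # accumulated state cost per path

    # Euler-Maruyama simulation of the uncontrolled dynamics (u = 0).
    for _ in range(n_steps):
        S += q(x) * dt
        x = x + f(x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

    S += phi(x)                       # add terminal cost
    psi = np.mean(np.exp(-S / lam))   # desirability estimate
    return -lam * np.log(psi)

if __name__ == "__main__":
    zero = lambda x: np.zeros_like(x)
    v = pi_value_estimate(1.0, 0.0, 1.0, f=zero, q=zero,
                          phi=lambda x: 0.5 * x**2, sigma=1.0, R=1.0)
    print(v)
```

For this linear-quadratic test case (f = 0, q = 0, phi(x) = x^2 / 2, sigma = R = T = 1, x0 = 1) the Riccati solution gives V = (1/2) ln 2 + 1/4, which the estimate should approach as the number of paths grows. Evaluating such an expectation at every state of interest is what becomes expensive in higher dimensions, which is the difficulty the abstract attributes to the curse of dimensionality.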


Similar articles


Design of nonlinear optimal control for chaotic synchronization of coupled stochastic neural networks via Hamilton-Jacobi-Bellman equation.

Neural Netw. 2018 Mar;99:166-177. doi: 10.1016/j.neunet.2018.01.003. Epub 2018 Feb 7.

Policy-Iteration-Based Finite-Horizon Approximate Dynamic Programming for Continuous-Time Nonlinear Optimal Control.

IEEE Trans Neural Netw Learn Syst. 2023 Sep;34(9):5255-5267. doi: 10.1109/TNNLS.2022.3225090. Epub 2023 Sep 1.

Finite-Time Adaptive Dynamic Programming for Affine-Form Nonlinear Systems.

IEEE Trans Neural Netw Learn Syst. 2025 Feb;36(2):3573-3586. doi: 10.1109/TNNLS.2023.3337387. Epub 2025 Feb 6.

A policy iteration approach to online optimal control of continuous-time constrained-input systems.

ISA Trans. 2013 Sep;52(5):611-21. doi: 10.1016/j.isatra.2013.04.004. Epub 2013 May 24.

Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics.

IEEE Trans Neural Netw Learn Syst. 2013 Jan;24(1):145-57. doi: 10.1109/TNNLS.2012.2227339.

Data-Driven Finite-Horizon Approximate Optimal Control for Discrete-Time Nonlinear Systems Using Iterative HDP Approach.

IEEE Trans Cybern. 2018 Oct;48(10):2948-2961. doi: 10.1109/TCYB.2017.2752845. Epub 2017 Oct 10.

Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems.

IEEE Trans Cybern. 2020 Feb;50(2):402-413. doi: 10.1109/TCYB.2018.2856510. Epub 2018 Sep 10.

Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems.

IEEE Trans Neural Netw. 2008 Jan;19(1):90-106. doi: 10.1109/TNN.2007.900227.

Event-Triggered Adaptive Dynamic Programming for Continuous-Time Systems With Control Constraints.

IEEE Trans Neural Netw Learn Syst. 2017 Aug;28(8):1941-1952. doi: 10.1109/TNNLS.2016.2586303. Epub 2016 Aug 31.
