Neural-network-based discounted optimal control via an integrated value iteration with accuracy guarantee.

Affiliations

School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China.

Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China.

Publication information

Neural Netw. 2021 Dec;144:176-186. doi: 10.1016/j.neunet.2021.08.025. Epub 2021 Aug 28.

DOI:10.1016/j.neunet.2021.08.025
PMID:34500256
Abstract

A data-based value iteration algorithm with the bidirectional approximation feature is developed for discounted optimal control. The unknown nonlinear system dynamics is first identified by establishing a model neural network. To improve the identification precision, biases are introduced to the model network. The model network with biases is trained by the gradient descent algorithm, where the weights and biases across all layers are updated. The uniform ultimate boundedness stability with a proper learning rate is analyzed, by using the Lyapunov approach. Moreover, an integrated value iteration with the discounted cost is developed to fully guarantee the approximation accuracy of the optimal value function. Then, the effectiveness of the proposed algorithm is demonstrated by carrying out two simulation examples with physical backgrounds.
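As a rough illustration of value iteration with a discounted cost, the sketch below works through a scalar linear-quadratic problem where the value function has the closed form V(x) = p x^2, so each iteration reduces to an update of the single number p. This is not the paper's algorithm: the system x_{k+1} = a x_k + b u_k, the stage cost q x^2 + r u^2, the numeric parameters, and the function names are all assumptions chosen for the example, and no neural networks are involved. Running the iteration from a zero initial value and from a large initial value gives one sequence that rises and one that falls toward the same fixed point, which is one simple way to picture approaching the optimal value function from two directions.

```python
def bellman_update(p, a, b, q, r, gamma):
    """One discounted value-iteration step for V(x) = p * x**2.

    Minimizing q*x^2 + r*u^2 + gamma * p * (a*x + b*u)^2 over u gives
    the closed-form update below (hypothetical scalar example).
    """
    return q + gamma * a**2 * p * r / (r + gamma * b**2 * p)


def value_iteration(p0, a, b, q, r, gamma, iters=200):
    """Return the whole sequence p_0, p_1, ..., p_iters."""
    seq = [p0]
    for _ in range(iters):
        seq.append(bellman_update(seq[-1], a, b, q, r, gamma))
    return seq


def greedy_gain(p, a, b, r, gamma):
    """Feedback gain k of the greedy policy u = -k * x for a given p."""
    return gamma * a * b * p / (r + gamma * b**2 * p)


# Illustrative parameters (assumed, not taken from the paper).
a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95

lower = value_iteration(0.0, a, b, q, r, gamma)    # starts below the fixed point
upper = value_iteration(100.0, a, b, q, r, gamma)  # starts above the fixed point

p_star = 0.5 * (lower[-1] + upper[-1])
print("approximate fixed point p*:", p_star)
print("gap between the two sequences:", upper[-1] - lower[-1])
print("greedy feedback gain:", greedy_gain(p_star, a, b, r, gamma))
```

In this toy setting the gap between the two sequences gives a directly computable bound on how far either iterate is from the fixed point, which is the kind of accuracy check a two-sided scheme makes possible.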

Similar articles

1
Neural-network-based discounted optimal control via an integrated value iteration with accuracy guarantee.
Neural Netw. 2021 Dec;144:176-186. doi: 10.1016/j.neunet.2021.08.025. Epub 2021 Aug 28.
2
Improved value iteration for neural-network-based stochastic optimal control design.
Neural Netw. 2020 Apr;124:280-295. doi: 10.1016/j.neunet.2020.01.004. Epub 2020 Jan 28.
3
Event-Triggered ADP for Tracking Control of Partially Unknown Constrained Uncertain Systems.
IEEE Trans Cybern. 2022 Sep;52(9):9001-9012. doi: 10.1109/TCYB.2021.3054626. Epub 2022 Aug 18.
4
An Approximate Neuro-Optimal Solution of Discounted Guaranteed Cost Control Design.
IEEE Trans Cybern. 2022 Jan;52(1):77-86. doi: 10.1109/TCYB.2020.2977318. Epub 2022 Jan 11.
5
Adaptive optimal control of affine nonlinear systems via identifier-critic neural network approximation with relaxed PE conditions.
Neural Netw. 2023 Oct;167:588-600. doi: 10.1016/j.neunet.2023.08.044. Epub 2023 Sep 1.
6
Neural critic learning with accelerated value iteration for nonlinear model predictive control.
Neural Netw. 2024 Aug;176:106364. doi: 10.1016/j.neunet.2024.106364. Epub 2024 May 6.
7
Particle swarm optimized neural networks based local tracking control scheme of unknown nonlinear interconnected systems.
Neural Netw. 2021 Feb;134:54-63. doi: 10.1016/j.neunet.2020.09.020. Epub 2020 Nov 11.
8
Adaptive Reinforcement Learning Neural Network Control for Uncertain Nonlinear System With Input Saturation.
IEEE Trans Cybern. 2020 Aug;50(8):3433-3443. doi: 10.1109/TCYB.2019.2921057. Epub 2019 Jun 26.
9
Error bounds of adaptive dynamic programming algorithms for solving undiscounted optimal control problems.
IEEE Trans Neural Netw Learn Syst. 2015 Jun;26(6):1323-34. doi: 10.1109/TNNLS.2015.2402203. Epub 2015 Mar 3.
10
Adaptive optimal control of unknown constrained-input systems using policy iteration and neural networks.
IEEE Trans Neural Netw Learn Syst. 2013 Oct;24(10):1513-25. doi: 10.1109/TNNLS.2013.2276571.