
Online lifelong optimal tracking control of uncertain nonlinear continuous-time strict-feedback systems using deep neural networks.

Authors

Ganie Irfan, Jagannathan S

Affiliation

Dept. of Electrical and Computer Engineering, Missouri University of Science and Technology, Rolla, 65401, MO, USA.

Publication

Neural Netw. 2025 Nov;191:107793. doi: 10.1016/j.neunet.2025.107793. Epub 2025 Jul 5.

DOI: 10.1016/j.neunet.2025.107793
PMID: 40633288
Abstract

A novel integral reinforcement learning (IRL)-based optimal trajectory tracking scheme is introduced for nonlinear continuous-time systems in strict-feedback form, using backstepping and multilayer (deep) neural networks (DNNs). The proposed method employs a dynamic surface control-based technique within an optimal framework, which relaxes the need to repeatedly compute the derivatives of the virtual controllers at each step of the backstepping process. At each backstepping step, an actor-critic DNN that uses an online singular value decomposition (SVD) of the activation function gradient is employed to minimize a discounted value function. Novel online SVD-based weight update laws for the actor and critic DNNs, which are shown to mitigate the vanishing gradient problem, are derived from the control input error and the Bellman error, respectively. A new online lifelong learning (LL) technique that uses the Bellman residual and control input errors to overcome catastrophic forgetting in both the critic and actor DNNs is also attempted, and closed-loop stability is analyzed and demonstrated. The effectiveness of the proposed method is shown in simulations of mobile robot tracking and a ship autopilot, which demonstrate a 76% total cost reduction compared with the literature.
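
The abstract does not give the form of the discounted value function or the Bellman error, so the following is only a minimal sketch of the general integral-Bellman idea, not the authors' formulation. It assumes a hypothetical scalar plant dx/dt = -x + u, a fixed linear policy u = -k*x, a quadratic running cost, and a quadratic critic V_hat(x) = p_hat * x**2 in place of a DNN. All names and parameter values are illustrative assumptions.

```python
import math


def integral_bellman_residual(p_hat, x0, T=0.1, gamma=0.5, k=1.0,
                              q=1.0, R=1.0, n=2000):
    """Residual of a discounted integral Bellman equation over [t, t+T].

    Toy setting (an assumption, not taken from the paper):
      plant   dx/dt = -x + u
      policy  u = -k * x
      cost    r(x, u) = q * x**2 + R * u**2
      critic  V_hat(x) = p_hat * x**2
    """
    a = 1.0 + k                      # closed-loop decay rate: dx/dt = -a * x
    dt = T / n
    integral = 0.0
    for i in range(n):
        s = (i + 0.5) * dt           # midpoint quadrature node
        x = x0 * math.exp(-a * s)    # exact closed-loop trajectory
        u = -k * x
        integral += math.exp(-gamma * s) * (q * x * x + R * u * u) * dt
    x_end = x0 * math.exp(-a * T)
    # discounted cost over the window + discounted value at the end
    # minus the value at the start of the window
    return (integral
            + math.exp(-gamma * T) * p_hat * x_end ** 2
            - p_hat * x0 ** 2)


def true_value_coefficient(gamma=0.5, k=1.0, q=1.0, R=1.0):
    """Closed-form p such that V(x) = p * x**2 for the toy problem above."""
    return (q + R * k * k) / (gamma + 2.0 * (1.0 + k))


if __name__ == "__main__":
    p_star = true_value_coefficient()
    print(integral_bellman_residual(p_star, x0=2.0))        # close to zero
    print(integral_bellman_residual(2.0 * p_star, x0=2.0))  # clearly nonzero
```

In this toy setting the residual vanishes only when the critic matches the true discounted cost-to-go, which is why a critic update law can be driven by it. No derivative of the state and no drift model is needed to evaluate the residual, only measured cost over the window and the critic at its two endpoints.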

Similar articles

1
Online lifelong optimal tracking control of uncertain nonlinear continuous-time strict-feedback systems using deep neural networks.
Neural Netw. 2025 Nov;191:107793. doi: 10.1016/j.neunet.2025.107793. Epub 2025 Jul 5.
2
Prescription of Controlled Substances: Benefits and Risks
3
Data-based decentralized control of nonlinear-constrained interconnected systems using reinforcement learning.
Neural Netw. 2025 Nov;191:107780. doi: 10.1016/j.neunet.2025.107780. Epub 2025 Jun 30.
4
Event-triggered ADP-based tracking controller for partially unknown nonlinear uncertain systems with input and state constraints.
Neural Netw. 2025 Nov;191:107752. doi: 10.1016/j.neunet.2025.107752. Epub 2025 Jun 21.
5
Neuro-XAI: Explainable deep learning framework based on deeplabV3+ and bayesian optimization for segmentation and classification of brain tumor in MRI scans.
J Neurosci Methods. 2024 Oct;410:110247. doi: 10.1016/j.jneumeth.2024.110247. Epub 2024 Aug 10.
6
Data-driven optimal tracking control for nonlinear systems with performance constraints via adaptive dynamic programming.
Neural Netw. 2025 Nov;191:107852. doi: 10.1016/j.neunet.2025.107852. Epub 2025 Jul 9.
7
Actor critic with experience replay-based automatic treatment planning for prostate cancer intensity modulated radiotherapy.
Med Phys. 2025 Jul;52(7):e17915. doi: 10.1002/mp.17915. Epub 2025 May 31.
8
Radiogenomic explainable AI with neural ordinary differential equation for identifying post-SRS brain metastasis radionecrosis.
Med Phys. 2025 Apr;52(4):2661-2674. doi: 10.1002/mp.17635. Epub 2025 Jan 29.
9
Broad Critic Deep Actor Reinforcement Learning for Continuous Control.
IEEE Trans Neural Netw Learn Syst. 2025 Apr 8;PP. doi: 10.1109/TNNLS.2025.3554082.
10
Lifelong Learning-Based Optimal Trajectory Tracking Control of Constrained Nonlinear Affine Systems Using Deep Neural Networks.
IEEE Trans Cybern. 2024 Dec;54(12):7133-7146. doi: 10.1109/TCYB.2024.3405354. Epub 2024 Nov 27.