Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems.

Affiliation

College of Electronics and Information Engineering, Southwest University, Chongqing, 400715, PR China.

Publication information

Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.

DOI: 10.1016/j.neunet.2024.106667
PMID: 39216294
Abstract

This paper addresses the tracking control problem of nonlinear discrete-time multi-agent systems (MASs). First, a local neighborhood error system (LNES) is constructed. Then, a novel tracking algorithm based on asynchronous iterative Q-learning (AIQL) is developed, which transforms the tracking problem into the optimal regulation of the LNES. The AIQL-based algorithm maintains two Q-functions for each agent i: one is used for improving the control policy, and the other is used for evaluating the control policy. Moreover, a convergence analysis of the LNES is given. It is shown that the LNES converges to 0, so the tracking problem is solved. A neural network-based actor-critic framework is used to implement AIQL. The critic network of AIQL is composed of two neural networks, which approximate the two Q-functions, respectively. Finally, simulation results are given to verify the performance of the developed algorithm. They show that the AIQL-based tracking algorithm achieves a lower cost value and faster convergence than the IQL-based tracking algorithm.
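
The abstract does not state how the LNES is defined. As a rough illustration only, the sketch below uses a definition that is common in the multi-agent tracking literature, e_i = sum_j a_ij (x_i - x_j) + b_i (x_i - x_0), with scalar agent states. The function name, the edge weights `adjacency`, and the pinning gains `pinning` are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def local_neighborhood_errors(
    states: Sequence[float],
    leader_state: float,
    adjacency: Sequence[Sequence[float]],
    pinning: Sequence[float],
) -> List[float]:
    """Return e_i = sum_j a_ij*(x_i - x_j) + b_i*(x_i - x_0) for scalar states.

    Assumed (hypothetical) definition; the paper's exact LNES may differ.
    """
    errors = []
    for i, x_i in enumerate(states):
        # Disagreement with each neighbor j, weighted by the edge weight a_ij.
        neighbor_term = sum(
            a_ij * (x_i - x_j) for a_ij, x_j in zip(adjacency[i], states)
        )
        # Disagreement with the leader, weighted by the pinning gain b_i.
        leader_term = pinning[i] * (x_i - leader_state)
        errors.append(neighbor_term + leader_term)
    return errors


# Three followers on a chain: agent 0 sees the leader, 1 sees 0, 2 sees 1.
adjacency = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]
pinning = [1, 0, 0]
errors = local_neighborhood_errors([2.0, 1.0, 4.0], 1.5, adjacency, pinning)
print(errors)
```

Under this assumed definition, every error is zero exactly when all followers share the leader's state (given a connected graph with at least one pinned agent), which is why driving the LNES to 0 corresponds to solving the tracking problem.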


Similar articles

1
Asynchronous iterative Q-learning based tracking control for nonlinear discrete-time multi-agent systems.
Neural Netw. 2024 Dec;180:106667. doi: 10.1016/j.neunet.2024.106667. Epub 2024 Aug 26.
2
Neural Q-learning for discrete-time nonlinear zero-sum games with adjustable convergence rate.
Neural Netw. 2024 Jul;175:106274. doi: 10.1016/j.neunet.2024.106274. Epub 2024 Mar 27.
3
Novel optimal trajectory tracking for nonlinear affine systems with an advanced critic learning structure.
Neural Netw. 2022 Oct;154:131-140. doi: 10.1016/j.neunet.2022.07.019. Epub 2022 Jul 16.
4
Asynchronous learning for actor-critic neural networks and synchronous triggering for multiplayer system.
ISA Trans. 2022 Oct;129(Pt B):295-308. doi: 10.1016/j.isatra.2022.02.007. Epub 2022 Feb 10.
5
Neural critic learning with accelerated value iteration for nonlinear model predictive control.
Neural Netw. 2024 Aug;176:106364. doi: 10.1016/j.neunet.2024.106364. Epub 2024 May 6.
6
Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems.
IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1308-1320. doi: 10.1109/TNNLS.2018.2861945. Epub 2018 Sep 26.
7
Model-Free Optimal Tracking Control via Critic-Only Q-Learning.
IEEE Trans Neural Netw Learn Syst. 2016 Oct;27(10):2134-44. doi: 10.1109/TNNLS.2016.2585520. Epub 2016 Jul 12.
8
Neural network robust tracking control with adaptive critic framework for uncertain nonlinear systems.
Neural Netw. 2018 Jan;97:11-18. doi: 10.1016/j.neunet.2017.09.005. Epub 2017 Sep 21.
9
Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm.
IEEE Trans Neural Netw Learn Syst. 2022 Jun 3;PP. doi: 10.1109/TNNLS.2022.3178746.
10
Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning.
IEEE Trans Neural Netw Learn Syst. 2023 Jul;34(7):3553-3567. doi: 10.1109/TNNLS.2021.3112457. Epub 2023 Jul 6.