Jiang Wei-Cheng, Narayanan Vignesh, Li Jr-Shin
IEEE Trans Cybern. 2021 Dec;51(12):5717-5727. doi: 10.1109/TCYB.2019.2958912. Epub 2021 Dec 22.
A demanding task for a reinforcement learning agent in an uncertain environment is to quickly learn a policy, or a sequence of actions, with which it can achieve the desired goal. In this article, we present an incremental model learning scheme to reconstruct the model of a stochastic environment. In the proposed learning scheme, we introduce a clustering algorithm to assimilate the model information and estimate the probability of each state transition. In addition, utilizing the reconstructed model, we present an experience replay strategy that creates virtual interactive experiences by balancing exploration and exploitation, which greatly accelerates learning and enables planning. Furthermore, we extend the proposed learning scheme to a multiagent framework to decrease the exploration effort and reduce the learning time in a large environment. In this multiagent framework, we introduce a knowledge-sharing algorithm to share the reconstructed model information among the different agents, as needed, and develop a computationally efficient knowledge-fusing mechanism that fuses the knowledge acquired from an agent's own experience with the knowledge received from its teammates. Finally, simulation results with comparative analysis are provided to demonstrate the efficacy of the proposed methods in complex learning tasks.
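The abstract does not give the details of the incremental model learning scheme; as a rough illustration, the core idea of reconstructing a stochastic environment's model by estimating transition probabilities from observed data can be sketched as a count-based estimator (the `EmpiricalModel` class and its methods are hypothetical names, and the paper's clustering algorithm for assimilating model information is not reproduced here):

```python
from collections import defaultdict

class EmpiricalModel:
    """Incrementally estimates P(s' | s, a) from observed transitions.

    A minimal count-based sketch; each observed transition (s, a, s')
    increments a counter, and probabilities are the normalized counts.
    """
    def __init__(self):
        # (s, a) -> {s': number of times s' followed (s, a)}
        self.counts = defaultdict(lambda: defaultdict(int))

    def update(self, s, a, s_next):
        """Assimilate one observed transition into the model."""
        self.counts[(s, a)][s_next] += 1

    def transition_probs(self, s, a):
        """Return the estimated distribution over successor states."""
        dist = self.counts[(s, a)]
        total = sum(dist.values())
        return {sp: n / total for sp, n in dist.items()} if total else {}

# Example: after seeing "s1" twice and "s2" once from ("s0", "up"),
# the estimate is s1 with probability 2/3 and s2 with probability 1/3.
model = EmpiricalModel()
for s_next in ["s1", "s1", "s2"]:
    model.update("s0", "up", s_next)
probs = model.transition_probs("s0", "up")
```

Because the estimate is updated one transition at a time, the agent never has to store or replay its raw experience to maintain the model.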
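The experience replay strategy described in the abstract uses the reconstructed model to generate virtual interactive experiences. A generic Dyna-style sketch of this idea follows, assuming the learned model is a dictionary of transition distributions and rewards are stored in a separate lookup table; the epsilon parameter and the sampling rule are illustrative stand-ins for the paper's exploration–exploitation balance, not the authors' actual mechanism:

```python
import random

def virtual_replay(Q, model, rewards, actions, n_steps=50,
                   alpha=0.1, gamma=0.95, epsilon=0.1):
    """Dyna-style planning sketch: sample previously seen (s, a) pairs,
    draw a successor from the learned transition distribution, and apply
    a Q-learning backup to the virtual transition.

    model   : {(s, a): {s_next: probability}} learned from real experience
    rewards : {(s, a, s_next): r} hypothetical reward lookup table
    epsilon : probability of replaying a random seen pair (exploration)
              rather than the highest-valued one (exploitation)
    """
    seen = list(model)  # (s, a) pairs with estimated dynamics
    for _ in range(n_steps):
        if random.random() < epsilon:
            s, a = random.choice(seen)                       # explore
        else:
            s, a = max(seen, key=lambda sa: Q.get(sa, 0.0))  # exploit
        dist = model[(s, a)]
        s_next = random.choices(list(dist), weights=dist.values())[0]
        r = rewards.get((s, a, s_next), 0.0)
        best_next = max((Q.get((s_next, b), 0.0) for b in actions),
                        default=0.0)
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
            r + gamma * best_next - Q.get((s, a), 0.0))
    return Q

# Toy usage: a single rewarded transition; repeated virtual backups
# raise its Q-value without any further real interaction.
random.seed(0)
toy_model = {("s0", "go"): {"goal": 1.0}}
toy_rewards = {("s0", "go", "goal"): 1.0}
Q = virtual_replay({}, toy_model, toy_rewards, actions=["go"], n_steps=10)
```

The appeal of this class of methods is that each real transition can be replayed many times through the model, which is what "greatly accelerates learning and enables planning" refers to.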
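For the multiagent setting, the abstract mentions a computationally efficient mechanism for fusing an agent's own model knowledge with knowledge received from teammates. One simple way such a fusion can work, when the shared knowledge takes the form of transition counts as in the sketch above, is additive merging of the count tables (this is an assumed illustration, not the paper's mechanism):

```python
def fuse_counts(own, received):
    """Fuse transition counts from an agent's own experience with counts
    shared by a teammate, by adding the per-successor tallies.

    own, received : {(s, a): {s_next: count}}
    Returns a new fused table; neither input is modified.
    """
    fused = {sa: dict(d) for sa, d in own.items()}
    for sa, d in received.items():
        slot = fused.setdefault(sa, {})
        for sp, n in d.items():
            slot[sp] = slot.get(sp, 0) + n
    return fused

# Example: the teammate contributes both extra evidence for a known
# transition and dynamics for a state-action pair this agent never tried.
own = {("s0", "u"): {"s1": 2}}
received = {("s0", "u"): {"s1": 1, "s2": 3}, ("s1", "u"): {"s2": 1}}
fused = fuse_counts(own, received)
```

Merging counts rather than normalized probabilities keeps the fusion a single pass over the shared table and automatically weights each agent's estimate by how much evidence it has gathered.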