

Hybrid knowledge transfer for MARL based on action advising and experience sharing

Authors

Liu Feng, Li Dongqi, Gao Jian

Affiliations

School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an, China.

Kunming Precision Machinery Research Institute, Kunming, China.

Publication

Front Neurorobot. 2024 May 7;18:1364587. doi: 10.3389/fnbot.2024.1364587. eCollection 2024.

DOI: 10.3389/fnbot.2024.1364587
PMID: 38774520
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11106758/
Abstract

Multiagent Reinforcement Learning (MARL) has been widely adopted owing to its exceptional ability to solve multiagent decision-making problems. To further enhance learning efficiency, knowledge transfer algorithms have been developed, among which experience-sharing-based and action-advising-based transfer strategies are the two mainstream families. However, although both strategies have many successful applications, neither is flawless. The long-developed action-advising-based methods (KT-AA, short for knowledge transfer based on action advising) suffer from unsatisfactory data efficiency and scalability. The newly proposed experience-sharing-based knowledge transfer methods (KT-ES) partially overcome the shortcomings of KT-AA, but they are unable to correct specific bad decisions in the later learning stage. To leverage the strengths of both KT-AA and KT-ES, this study proposes KT-Hybrid, a hybrid knowledge transfer approach. In the early learning phase, KT-ES methods are employed, exploiting their better data efficiency to raise the policy to a basic level as soon as possible. Later, the focus shifts to correcting specific errors made by the basic policy, using KT-AA methods to further improve performance. Simulations demonstrate that the proposed KT-Hybrid outperforms well-received action-advising- and experience-sharing-based methods.
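The two-phase scheme the abstract describes can be illustrated with a minimal toy sketch. This is not the paper's algorithm: it assumes tabular Q-learning agents, uses Q-value spread as a crude confidence proxy, and the function names (`kt_hybrid_step`) and the advice threshold are illustrative choices, not taken from the source.

```python
import random

class Agent:
    """Toy tabular Q-learning agent (hypothetical setup; the paper's
    simulation environment and learners are not reproduced here)."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, eps=0.2):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.n_actions = n_actions

    def act(self, s):
        # epsilon-greedy action selection
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[s][a])

    def confidence(self, s):
        # spread of Q-values as a crude certainty proxy (assumption)
        return max(self.q[s]) - min(self.q[s])

    def update(self, s, a, r, s2):
        # standard one-step Q-learning update
        target = r + self.gamma * max(self.q[s2])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def kt_hybrid_step(student, teacher, s, a, r, s2, phase, advice_threshold=0.5):
    """One knowledge-transfer step in the two-phase spirit of KT-Hybrid.
    phase == "early": experience sharing (KT-ES style), a transition from
    the teacher's experience is replayed into the student's update rule.
    phase == "late": action advising (KT-AA style), the teacher suggests
    an action only when the student looks uncertain in state s."""
    if phase == "early":
        student.update(s, a, r, s2)   # absorb the shared experience
        return None
    if student.confidence(s) < advice_threshold:
        return teacher.act(s)         # teacher's advised action
    return None                       # student is confident: no advice
```

The point of the sketch is the switch itself: early on, shared transitions improve every student update (data-efficient), while later, advice is spent only on the specific low-confidence states where the basic policy still errs.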


Figures (PMC, fnbot-18-1364587, g0001 to g0008):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/e94bd560f791/fnbot-18-1364587-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/1ca1d9736c11/fnbot-18-1364587-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/b58e5a73536f/fnbot-18-1364587-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/823348986190/fnbot-18-1364587-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/e848913aad46/fnbot-18-1364587-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/d1188ce9ad3e/fnbot-18-1364587-g0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/9fb9d1842d26/fnbot-18-1364587-g0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/821c/11106758/4a20421a92a4/fnbot-18-1364587-g0008.jpg

Similar Articles

1. Hybrid knowledge transfer for MARL based on action advising and experience sharing.
   Front Neurorobot. 2024 May 7;18:1364587. doi: 10.3389/fnbot.2024.1364587. eCollection 2024.
2. Accelerating Multiagent Reinforcement Learning by Equilibrium Transfer.
   IEEE Trans Cybern. 2015 Jul;45(7):1289-302. doi: 10.1109/TCYB.2014.2349152. Epub 2014 Aug 29.
3. Lateral Transfer Learning for Multiagent Reinforcement Learning.
   IEEE Trans Cybern. 2023 Mar;53(3):1699-1711. doi: 10.1109/TCYB.2021.3108237. Epub 2023 Feb 15.
4. Multiagent Reinforcement Learning With Sparse Interactions by Negotiation and Knowledge Transfer.
   IEEE Trans Cybern. 2017 May;47(5):1238-1250. doi: 10.1109/TCYB.2016.2543238. Epub 2016 Mar 31.
5. Multiexperience-Assisted Efficient Multiagent Reinforcement Learning.
   IEEE Trans Neural Netw Learn Syst. 2024 Sep;35(9):12678-12692. doi: 10.1109/TNNLS.2023.3264275. Epub 2024 Sep 3.
6. Strangeness-driven exploration in multi-agent reinforcement learning.
   Neural Netw. 2024 Apr;172:106149. doi: 10.1016/j.neunet.2024.106149. Epub 2024 Jan 26.
7. Knowledge Reuse of Multi-Agent Reinforcement Learning in Cooperative Tasks.
   Entropy (Basel). 2022 Mar 28;24(4):470. doi: 10.3390/e24040470.
8. Sharing What We Know about Living a Good Life: Indigenous Approaches to Knowledge Translation.
   J Can Health Libr Assoc. 2014;35:16-23. doi: 10.5596/c14-009.
9. Large-Scale Traffic Signal Control Using a Novel Multiagent Reinforcement Learning.
   IEEE Trans Cybern. 2021 Jan;51(1):174-187. doi: 10.1109/TCYB.2020.3015811. Epub 2020 Dec 22.
10. KnowRU: Knowledge Reuse via Knowledge Distillation in Multi-Agent Reinforcement Learning.
   Entropy (Basel). 2021 Aug 13;23(8):1043. doi: 10.3390/e23081043.
