Multi-Task Multi-Agent Reinforcement Learning With Interaction and Task Representations

Author information

Li Chao, Dong Shaokang, Yang Shangdong, Hu Yujing, Ding Tianyu, Li Wenbin, Gao Yang

Publication information

IEEE Trans Neural Netw Learn Syst. 2025 Jul;36(7):13431-13445. doi: 10.1109/TNNLS.2024.3475216.

Abstract

Multi-task multi-agent reinforcement learning (MT-MARL) leverages useful knowledge across multiple related tasks to improve performance on any single task. While recent studies have tentatively achieved this by learning independent policies on a shared representation space, we argue that further advances can be made by explicitly characterizing agent interactions within these multi-agent tasks and by identifying task relations for selective reuse. To this end, this article proposes Representing Interactions and Tasks (RIT), a novel MT-MARL algorithm that characterizes both intra-task agent interactions and inter-task relations. Specifically, to characterize agent interactions, RIT introduces an interactive value decomposition that explicitly incorporates the dependencies among agents into policy learning. Theoretical analysis demonstrates that the learned utility value of each agent approximates its Shapley value, and thus represents agent interactions. Moreover, we learn task representations from per-agent local trajectories; these representations are used to assess task similarities and, accordingly, to identify task relations. As a result, RIT facilitates the effective transfer of interaction knowledge across similar multi-agent tasks. Structurally, RIT develops a universal policy structure for scalable multi-task policy learning. We evaluate RIT against multiple state-of-the-art baselines on various cooperative tasks, and its strong performance under both multi-task and zero-shot settings demonstrates its effectiveness.
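
For readers unfamiliar with the Shapley value mentioned in the abstract, the following is a minimal sketch of the textbook definition: each agent's value is its marginal contribution to the coalition, averaged over every ordering of the agents. This is not the RIT algorithm or its interactive value decomposition. The function `shapley_values`, the toy game `toy_value`, and the agent names are all hypothetical, chosen only for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(agents, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of agents to a real-valued coalition payoff.
    Cost grows factorially with the number of agents, so this is only
    practical for very small games.
    """
    totals = {a: 0.0 for a in agents}
    for order in permutations(agents):
        coalition = frozenset()
        for a in order:
            with_a = coalition | {a}
            # Marginal contribution of `a` when joining the current coalition.
            totals[a] += value(with_a) - value(coalition)
            coalition = with_a
    n_orders = factorial(len(agents))
    return {a: t / n_orders for a, t in totals.items()}


def toy_value(coalition):
    """Hypothetical cooperative game (an assumption, not from the paper).

    Agents contribute individually, and "a" and "b" earn a bonus of 3
    only when both are present, which models an interaction between them.
    """
    base = {"a": 1.0, "b": 2.0, "c": 0.0}
    payoff = sum(base[x] for x in coalition)
    if "a" in coalition and "b" in coalition:
        payoff += 3.0
    return payoff


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)
```

In this toy game the interaction bonus is split equally between the two agents that create it, and the values sum to the payoff of the full team. This is the sense in which a per-agent quantity can reflect interactions among agents.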
