Multiagent Meta-Reinforcement Learning for Adaptive Multipath Routing Optimization.

Authors

Chen Long, Hu Bin, Guan Zhi-Hong, Zhao Lian, Shen Xuemin

Publication information

IEEE Trans Neural Netw Learn Syst. 2022 Oct;33(10):5374-5386. doi: 10.1109/TNNLS.2021.3070584. Epub 2022 Oct 5.

DOI: 10.1109/TNNLS.2021.3070584

PMID: 33881997

Abstract

In this article, we investigate the routing problem of packet networks through multiagent reinforcement learning (RL), a very challenging topic in distributed and autonomous networked systems. Specifically, the routing problem is modeled as a networked multiagent partially observable Markov decision process (MDP). Since the MDP of a network node is affected not only by its neighboring nodes' policies but also by the network traffic demand, routing becomes a multitask learning problem. Inspired by the recent success of RL and metalearning, we propose two novel model-free multiagent RL algorithms, named multiagent proximal policy optimization (MAPPO) and multiagent metaproximal policy optimization (meta-MAPPO), to optimize network performance under fixed and time-varying traffic demand, respectively. A practicable distributed implementation framework is designed based on the separability of exploration and exploitation in training MAPPO. Simulation results demonstrate the excellent performance of the proposed algorithms compared with existing routing optimization policies.
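The abstract names proximal policy optimization as the base of both algorithms but does not give the objective. As a rough illustration only, the sketch below computes the clipped surrogate objective of standard single-agent PPO, which is not necessarily the exact multiagent or meta-learning formulation used in the paper. The function name `ppo_clip_objective`, its parameters, and the default `clip_eps = 0.2` are assumptions made for this example.

```python
from typing import Sequence


def ppo_clip_objective(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed default; the abstract gives no value
) -> float:
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    ratios[i]     = pi_new(a_i | o_i) / pi_old(a_i | o_i)
    advantages[i] = advantage estimate for sample i
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("at least one sample is required")

    total = 0.0
    for r, adv in zip(ratios, advantages):
        # Restrict the probability ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_r = min(max(r, 1.0 - clip_eps), 1.0 + clip_eps)
        # The minimum of the two terms removes the incentive to push the
        # ratio outside the clipping range in the direction that would
        # otherwise increase the objective.
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)
```

In a routing setting, one could imagine each node evaluating this quantity over its own locally observed samples (for example, next-hop choices and their estimated advantages), but how the paper distributes training across nodes is not specified in the abstract.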

