

Multi-Agent Reinforcement Learning for Joint Cooperative Spectrum Sensing and Channel Access in Cognitive UAV Networks.

Affiliations

Communication Measurement and Control Center, Chongqing University, Chongqing 400044, China.

Faculty of Engineering, Bar Ilan University, Ramat Gan 5290002, Israel.

Publication information

Sensors (Basel). 2022 Feb 20;22(4):1651. doi: 10.3390/s22041651.

Abstract

This paper studies distributed spectrum/channel access for cognitive radio-enabled unmanned aerial vehicles (CUAVs) that overlay upon primary channels. Under the framework of cooperative spectrum sensing and opportunistic transmission, a one-shot optimization problem for channel allocation is formulated, with the aim of maximizing the expected cumulative weighted reward of multiple CUAVs. To handle the uncertainty caused by the lack of prior knowledge about primary user activity and by the absence of a channel-access coordinator, the original problem is cast as a competition and cooperation hybrid multi-agent reinforcement learning (CCH-MARL) problem within the framework of a Markov game (MG). A value-iteration-based RL algorithm featuring upper confidence bound-Hoeffding (UCB-H) strategy searching is then proposed, in which each CUAV is treated as an independent learner (IL). To address the curse of dimensionality, the UCB-H strategy is further extended with a double deep Q-network (DDQN). Numerical simulations show that the proposed algorithms converge efficiently to stable strategies and significantly improve network performance compared with benchmark algorithms such as vanilla Q-learning and DDQN.
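The abstract does not give the update rule, so the following is only a minimal sketch of what a tabular independent learner with a UCB-Hoeffding bonus might look like. It assumes the commonly used UCB-H Q-learning form (learning rate (H+1)/(H+t) and bonus c·sqrt(H³·ι/t), where t is the visit count). The class name `UCBHLearner`, the constants `c` and `iota`, and the single-state two-channel toy environment in `run_toy` are hypothetical illustrations, not the paper's system model or algorithm.

```python
import math
import random


class UCBHLearner:
    """Tabular episodic Q-learning with a Hoeffding-style exploration bonus.

    Assumed form (not taken from the paper): alpha_t = (H+1)/(H+t),
    bonus_t = c * sqrt(H^3 * iota / t), optimistic initial Q = H.
    """

    def __init__(self, n_states, n_actions, horizon, c=1.0, iota=1.0):
        self.H = horizon
        self.c = c          # hypothetical bonus scale
        self.iota = iota    # hypothetical log-factor, fixed for simplicity
        # Q[h][s][a] starts at the largest possible return (optimism).
        self.Q = [[[float(horizon)] * n_actions for _ in range(n_states)]
                  for _ in range(horizon)]
        # N[h][s][a] counts visits to each (step, state, action) triple.
        self.N = [[[0] * n_actions for _ in range(n_states)]
                  for _ in range(horizon)]

    def value(self, h, s):
        """Clipped state value; zero beyond the last step of an episode."""
        if h >= self.H:
            return 0.0
        return min(float(self.H), max(self.Q[h][s]))

    def act(self, h, s):
        """Greedy action w.r.t. the optimistic Q (ties go to lowest index)."""
        row = self.Q[h][s]
        return row.index(max(row))

    def update(self, h, s, a, r, s_next):
        """One UCB-H style update after observing (s, a, r, s_next) at step h."""
        self.N[h][s][a] += 1
        t = self.N[h][s][a]
        alpha = (self.H + 1) / (self.H + t)
        bonus = self.c * math.sqrt(self.H ** 3 * self.iota / t)
        target = r + self.value(h + 1, s_next) + bonus
        self.Q[h][s][a] = (1 - alpha) * self.Q[h][s][a] + alpha * target


def run_toy(episodes=2000, idle_prob=(0.2, 0.8), seed=0):
    """Hypothetical toy: one CUAV picks one of two channels per episode.

    Reward is 1 when the chosen channel happens to be idle, else 0.
    Returns how often each channel was selected.
    """
    rng = random.Random(seed)
    agent = UCBHLearner(n_states=1, n_actions=len(idle_prob), horizon=1, c=0.5)
    picks = [0] * len(idle_prob)
    for _ in range(episodes):
        a = agent.act(0, 0)
        r = 1.0 if rng.random() < idle_prob[a] else 0.0
        agent.update(0, 0, a, r, 0)
        picks[a] += 1
    return picks
```

Calling `run_toy()` returns the per-channel selection counts for the toy setting. In a multi-CUAV setting each agent would hold its own `UCBHLearner` and update it from its own observations, which is the independent-learner idea the abstract describes; the DDQN extension would replace the tables with a function approximator and is not shown here.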


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b75/8879711/b7a12f293640/sensors-22-01651-g001.jpg
