

Multi-Agent DRL for Air-to-Ground Communication Planning in UAV-Enabled IoT Networks

Authors

Qureshi Khalid Ibrahim, Lu Bingxian, Lu Cheng, Lodhi Muhammad Ali, Wang Lei

Affiliation

Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, School of Software, Dalian University of Technology, Dalian 116024, China.

Publication

Sensors (Basel). 2024 Oct 10;24(20):6535. doi: 10.3390/s24206535.

DOI: 10.3390/s24206535
PMID: 39460016
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11511505/
Abstract

In this paper, we present a novel method to enhance the sum-rate effectiveness in full-duplex unmanned aerial vehicle (UAV)-assisted communication networks. Existing approaches often couple uplink and downlink associations, resulting in suboptimal performance, particularly in dynamic environments where user demands and network conditions are unpredictable. To overcome these limitations, we propose a decoupling of uplink and downlink associations for ground-based users (GBUs), significantly improving network efficiency. We formulate a comprehensive optimization problem that integrates UAV trajectory design and user association, aiming to maximize the overall sum-rate efficiency of the network. Due to the problem's non-convexity, we reformulate it as a Partially Observable Markov Decision Process (POMDP), enabling UAVs to make real-time decisions based on local observations without requiring complete global information. Our framework employs multi-agent deep reinforcement learning (MADRL), specifically the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm, which balances centralized training with distributed execution. This allows UAVs to efficiently learn optimal user associations and trajectory controls while dynamically adapting to local conditions. The proposed solution is particularly suited for critical applications such as disaster response and search and rescue missions, highlighting the practical significance of utilizing UAVs for rapid network deployment in emergencies. By addressing the limitations of existing centralized and distributed solutions, our hybrid model combines the benefits of centralized training with the adaptability of distributed inference, ensuring optimal UAV operations in real-time scenarios.

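The centralized-training, distributed-execution split that the abstract attributes to MADDPG can be sketched minimally as follows. This is an illustrative skeleton, not the paper's implementation: the class names, dimensions, and linear stand-ins for the actor and critic networks are all assumptions. The structural point it shows is that each UAV's actor acts on its local observation only, while the critic used during training scores the joint observation-action vector of all agents.

```python
import numpy as np

class UAVAgent:
    """One UAV agent. The actor maps a local observation to a continuous
    action (e.g. trajectory control plus user-association scores). A random
    linear policy stands in for the actor network."""
    def __init__(self, obs_dim, act_dim, rng):
        self.W = rng.normal(0.0, 0.1, (act_dim, obs_dim))

    def act(self, obs):
        # Distributed execution: only this agent's local observation is used.
        return np.tanh(self.W @ obs)

class CentralCritic:
    """Centralized training: the critic evaluates the JOINT observations and
    actions of all agents, which is what distinguishes MADDPG from
    independent per-agent DDPG learners."""
    def __init__(self, joint_dim, rng):
        self.w = rng.normal(0.0, 0.1, joint_dim)

    def q_value(self, joint_obs, joint_act):
        return float(self.w @ np.concatenate([joint_obs, joint_act]))

# Illustrative sizes: 3 UAVs, 4-dim local observation, 2-dim action.
n_agents, obs_dim, act_dim = 3, 4, 2
rng = np.random.default_rng(0)
agents = [UAVAgent(obs_dim, act_dim, rng) for _ in range(n_agents)]
critic = CentralCritic(n_agents * (obs_dim + act_dim), rng)

obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
acts = [agent.act(o) for agent, o in zip(agents, obs)]         # decentralized
q = critic.q_value(np.concatenate(obs), np.concatenate(acts))  # centralized
```

In a full MADDPG loop, the critic's temporal-difference error would drive gradient updates of both networks during training, after which the critic is discarded and each UAV executes its actor from local observations alone, matching the POMDP formulation described above.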

Figures (PMC):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/c6c2c07321f1/sensors-24-06535-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/5148ae1d2631/sensors-24-06535-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/6b7106a8f2b5/sensors-24-06535-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/7e6baa9a4cdb/sensors-24-06535-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/ff9783e2a338/sensors-24-06535-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/91b7eea681a6/sensors-24-06535-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/82bce16ff006/sensors-24-06535-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/7b2d48e7a7c3/sensors-24-06535-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6578/11511505/67ae4dbadfb3/sensors-24-06535-g009.jpg

Similar Articles

1. Multi-Agent DRL for Air-to-Ground Communication Planning in UAV-Enabled IoT Networks. Sensors (Basel). 2024 Oct 10;24(20):6535. doi: 10.3390/s24206535.
2. Multi-Objective Optimization in Air-to-Air Communication System Based on Multi-Agent Deep Reinforcement Learning. Sensors (Basel). 2023 Nov 30;23(23):9541. doi: 10.3390/s23239541.
3. Multi-UAV simultaneous target assignment and path planning based on deep reinforcement learning in dynamic multiple obstacles environments. Front Neurorobot. 2024 Jan 22;17:1302898. doi: 10.3389/fnbot.2023.1302898. eCollection 2023.
4. Power Allocation and Energy Cooperation for UAV-Enabled MmWave Networks: A Multi-Agent Deep Reinforcement Learning Approach. Sensors (Basel). 2021 Dec 30;22(1):270. doi: 10.3390/s22010270.
5. Trajectory optimization of UAV-IRS assisted 6G THz network using deep reinforcement learning approach. Sci Rep. 2024 Aug 9;14(1):18501. doi: 10.1038/s41598-024-68459-8.
6. A novel energy-efficiency framework for UAV-assisted networks using adaptive deep reinforcement learning. Sci Rep. 2024 Sep 27;14(1):22188. doi: 10.1038/s41598-024-71621-x.
7. Searching and Tracking an Unknown Number of Targets: A Learning-Based Method Enhanced with Maps Merging. Sensors (Basel). 2021 Feb 4;21(4):1076. doi: 10.3390/s21041076.
8. Unmanned Aerial Vehicle Cooperative Data Dissemination Based on Graph Neural Networks. Sensors (Basel). 2024 Jan 30;24(3):887. doi: 10.3390/s24030887.
9. Deep Learning-Based Link Quality Estimation for RIS-Assisted UAV-Enabled Wireless Communications System. Sensors (Basel). 2023 Sep 23;23(19):8041. doi: 10.3390/s23198041.
10. Proactive Handover Decision for UAVs with Deep Reinforcement Learning. Sensors (Basel). 2022 Feb 5;22(3):1200. doi: 10.3390/s22031200.

Cited By

1. Decentralized resource allocation in UAV communication networks through reward based multi agent learning. Sci Rep. 2025 Sep 26;15(1):33122. doi: 10.1038/s41598-025-18353-8.