Multi-Agent Deep Reinforcement Learning Based Dynamic Task Offloading in a Device-to-Device Mobile-Edge Computing Network to Minimize Average Task Delay with Deadline Constraints.

Authors

He Huaiwen, Yang Xiangdong, Mi Xin, Shen Hong, Liao Xuefeng

Affiliations

School of Computer, Zhongshan Institute, University of Electronic Science and Technology of China, Zhongshan 528400, China.

Computer Science and Engineering School, University of Electronic Science and Technology of China, Chengdu 611731, China.

Publication details

Sensors (Basel). 2024 Aug 8;24(16):5141. doi: 10.3390/s24165141.

DOI:10.3390/s24165141
PMID:39204838
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11359727/
Abstract

Device-to-device (D2D) is a pivotal technology in the next generation of communication, allowing for direct task offloading between mobile devices (MDs) to improve the efficient utilization of idle resources. This paper proposes a novel algorithm for dynamic task offloading between the active MDs and the idle MDs in a D2D-MEC (mobile edge computing) system by deploying multi-agent deep reinforcement learning (DRL) to minimize the long-term average delay of delay-sensitive tasks under deadline constraints. Our core innovation is a dynamic partitioning scheme for idle and active devices in the D2D-MEC system, accounting for stochastic task arrivals and multi-time-slot task execution, which has been insufficiently explored in the existing literature. We adopt a queue-based system to formulate a dynamic task offloading optimization problem. To address the challenges of large action space and the coupling of actions across time slots, we model the problem as a Markov decision process (MDP) and perform multi-agent DRL through multi-agent proximal policy optimization (MAPPO). We employ a centralized training with decentralized execution (CTDE) framework to enable each MD to make offloading decisions solely based on its local system state. Extensive simulations demonstrate the efficiency and fast convergence of our algorithm. In comparison to the existing sub-optimal results deploying single-agent DRL, our algorithm reduces the average task completion delay by 11.0% and the ratio of dropped tasks by 17.0%. Our proposed algorithm is particularly pertinent to sensor networks, where mobile devices equipped with sensors generate a substantial volume of data that requires timely processing to ensure quality of experience (QoE) and meet the service-level agreements (SLAs) of delay-sensitive applications.
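
The abstract names MAPPO as the training method but gives no loss function or hyperparameters. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective that MAPPO applies to each agent's policy. The function name `clipped_surrogate`, the default `clip_eps=0.2`, and the plain-list inputs are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def clipped_surrogate(
    ratios: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; the abstract does not state one
) -> float:
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios[i] is pi_new(a | o) / pi_old(a | o) for one sample of one agent,
    where o is that agent's local observation. advantages[i] is the
    advantage estimate for the same sample; under centralized training
    with decentralized execution it would usually come from a critic
    that sees the global state during training.
    """
    if len(ratios) != len(advantages):
        raise ValueError("ratios and advantages must have the same length")
    if not ratios:
        raise ValueError("batch must not be empty")

    total = 0.0
    for ratio, adv in zip(ratios, advantages):
        # Keep the probability ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(ratios)


if __name__ == "__main__":
    # A large ratio with a positive advantage is capped at 1 + eps.
    print(clipped_surrogate([1.5], [1.0]))
    # A small ratio with a negative advantage is floored at 1 - eps.
    print(clipped_surrogate([0.5], [-1.0]))
```

In training, this quantity is maximized by gradient ascent on the policy parameters. The clipping limits how far a single update can move any one device's policy away from the policy that collected the data.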

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b63/11359727/ca44288d505f/sensors-24-05141-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b63/11359727/ce954e55fa8e/sensors-24-05141-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b63/11359727/2cd21d8aa134/sensors-24-05141-g003.jpg

Similar articles

1
Multi-Agent Deep Reinforcement Learning Based Dynamic Task Offloading in a Device-to-Device Mobile-Edge Computing Network to Minimize Average Task Delay with Deadline Constraints.
Sensors (Basel). 2024 Aug 8;24(16):5141. doi: 10.3390/s24165141.
2
A Multi-Agent RL Algorithm for Dynamic Task Offloading in D2D-MEC Network with Energy Harvesting.
Sensors (Basel). 2024 Apr 26;24(9):2779. doi: 10.3390/s24092779.
3
D2D-Assisted Multi-User Cooperative Partial Offloading in MEC Based on Deep Reinforcement Learning.
Sensors (Basel). 2022 Sep 15;22(18):7004. doi: 10.3390/s22187004.
4
DRL-OS: A Deep Reinforcement Learning-Based Offloading Scheduler in Mobile Edge Computing.
Sensors (Basel). 2022 Nov 26;22(23):9212. doi: 10.3390/s22239212.
5
Self-Adaptive Learning of Task Offloading in Mobile Edge Computing Systems.
Entropy (Basel). 2021 Aug 31;23(9):1146. doi: 10.3390/e23091146.
6
Fuzzy Decision-Based Efficient Task Offloading Management Scheme in Multi-Tier MEC-Enabled Networks.
Sensors (Basel). 2021 Feb 20;21(4):1484. doi: 10.3390/s21041484.
7
Joint Optimization of Multi-User Partial Offloading Strategy and Resource Allocation Strategy in D2D-Enabled MEC.
Sensors (Basel). 2023 Feb 25;23(5):2565. doi: 10.3390/s23052565.
8
A Federated Learning and Deep Reinforcement Learning-Based Method with Two Types of Agents for Computation Offload.
Sensors (Basel). 2023 Feb 16;23(4):2243. doi: 10.3390/s23042243.
9
HAGP: A Heuristic Algorithm Based on Greedy Policy for Task Offloading with Reliability of MDs in MEC of the Industrial Internet.
Sensors (Basel). 2021 May 18;21(10):3513. doi: 10.3390/s21103513.
10
Intelligent Task Dispatching and Scheduling Using a Deep Q-Network in a Cluster Edge Computing System.
Sensors (Basel). 2022 May 28;22(11):4098. doi: 10.3390/s22114098.
