Yang Xu, Fang Hai, Gao Yuan, Wang Xingjie, Wang Kan, Liu Zheng
Xi'an Institute of Space Radio Technology, Xi'an 710100, China.
School of Computer Science and Engineering, Xi'an University of Technology, Xi'an 710048, China.
Sensors (Basel). 2023 Dec 17;23(24):9885. doi: 10.3390/s23249885.
Traditional low earth orbit (LEO) satellite networks typically operate independently of terrestrial networks and have developed relatively slowly because of limited on-board capacity. Integrating emerging mobile edge computing (MEC) with LEO satellite networks forms a business-oriented "end-edge-cloud" multi-level computing architecture in which ground terminals can offload some computation-sensitive tasks to satellites, so that the network can satisfy more tasks. Making computation offloading and resource allocation decisions in LEO satellite edge networks is nevertheless challenging, because the decisions must track network dynamics and handle complex actions. To address the discrete-continuous hybrid action space and the time-varying network, this work uses the parameterized deep Q-network (P-DQN) for joint computation offloading and resource allocation. First, the characteristics of time-varying channels are modeled, and the communication and computation models under three different offloading decisions are constructed. Second, constraints on task offloading decisions, on the remaining available computing resources, and on the power control of LEO satellites and the cloud server are formulated, followed by the problem of maximizing the number of satisfied tasks over the long run. Third, using the parameterized action Markov decision process (PAMDP) and P-DQN, joint computation offloading, resource allocation, and power control decisions are made in real time, to accommodate the dynamics of LEO satellite edge networks and to handle the discrete-continuous hybrid action space. Simulation results show that the proposed P-DQN method approaches the optimal control and outperforms other reinforcement learning (RL) methods that handle only a discrete or only a continuous action space, in terms of the long-term rate of satisfied tasks.
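To illustrate what a discrete-continuous hybrid action looks like in practice, the following minimal sketch shows P-DQN-style action selection: one continuous parameter is produced for each discrete choice, every choice is scored with a Q-value, and the best-scoring pair is returned. It is not the authors' implementation. The three choice labels, the function name `select_hybrid_action`, the toy linear "networks", and the interpretation of the continuous parameter as a normalized power or resource share are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random

# Hypothetical labels: the abstract mentions three offloading decisions
# but does not name them.
OFFLOAD_CHOICES = ["local", "leo_satellite", "cloud"]


def _dot(row, vec):
    """Inner product of two equal-length sequences."""
    return sum(w * v for w, v in zip(row, vec))


def select_hybrid_action(state, param_weights, q_weights, epsilon=0.0, rng=None):
    """Pick a (discrete choice, continuous parameter) pair, P-DQN style.

    param_weights: K rows of len(state) weights; a toy stand-in for the
                   network that outputs one continuous parameter per choice.
    q_weights:     K rows of len(state) + K weights; a toy stand-in for the
                   Q-network that scores each choice given state and parameters.
    """
    if rng is None:
        rng = random.Random()
    # Step 1: one continuous parameter per discrete choice, squashed by a
    # sigmoid (assumed here to be a normalized power or resource share).
    params = [1.0 / (1.0 + math.exp(-_dot(row, state))) for row in param_weights]
    # Step 2: one Q-value per discrete choice, computed from the state
    # concatenated with all continuous parameters.
    q_input = list(state) + params
    q_values = [_dot(row, q_input) for row in q_weights]
    # Step 3: epsilon-greedy selection over the discrete part.
    if rng.random() < epsilon:
        k = rng.randrange(len(q_values))
    else:
        k = max(range(len(q_values)), key=q_values.__getitem__)
    return k, params[k], q_values
```

Calling `select_hybrid_action(state, param_weights, q_weights)` returns the index of the chosen entry in `OFFLOAD_CHOICES`, the continuous parameter attached to that choice, and the list of Q-values. In a trained agent both weight sets would be replaced by learned neural networks.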