Intelligent Hierarchical Admission Control for Low-Earth Orbit Satellites Based on Deep Reinforcement Learning

Authors

Wei Debin, Guo Chuanqi, Yang Li

Affiliations

Communication and Network Laboratory, Dalian University, Dalian 116622, China.

School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China.

Publication information

Sensors (Basel). 2023 Oct 14;23(20):8470. doi: 10.3390/s23208470.

DOI:10.3390/s23208470

PMID:37896563

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10611023/

Abstract

Low-Earth orbit (LEO) satellites have limited on-board resources, user terminals are unevenly distributed across a constantly changing coverage area, and service requirements vary significantly. There is therefore an urgent need to optimize resource allocation under limited satellite spectrum resources while ensuring fair service admission control. To this end, we propose an intelligent hierarchical admission control (IHAC) strategy based on deep reinforcement learning (DRL). The strategy combines the deep deterministic policy gradient (DDPG) and deep Q-network (DQN) algorithms to build a two-level (upper and lower) framework for resource allocation and admission control. The upper controller considers the state features of each ground zone and of the satellite resources from a global perspective and determines the beam resource allocation ratio for each ground zone. The lower controller formulates the admission control policy based on the upper controller's decision and the detailed information of the users' services. A purpose-designed reward and punishment mechanism is used to optimize the decisions of both controllers, so that service admission is as fair as possible across ground zones while beam resources are allocated reasonably among them. Finally, online decision-making is combined with offline learning, so that the controller can exploit a large amount of historical data to learn strategies with stronger adaptive ability while interacting with the network environment in real time. Extensive simulation results show that IHAC performs better in terms of successful service admission rate, service drop rate, and fairness of resource allocation: on average, the number of accepted services increased by 20.36%, the packet loss rate decreased by 17.56%, and resource fairness increased by 17.16%.
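The two-level control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the trained DDPG actor is replaced here by a simple softmax over per-zone load, the trained DQN by a greedy budget check, and Jain's fairness index is assumed as the fairness measure, since the abstract does not state which one is used. All names (`Service`, `upper_allocate`, `lower_admit`, `jain_fairness`) and the resource units are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Service:
    zone: int      # index of the ground zone the request comes from
    demand: float  # beam resource requested (hypothetical unit)


def upper_allocate(zone_loads, temperature=1.0):
    """Stand-in for the upper controller: map per-zone load to
    allocation ratios that sum to 1 (softmax, not a trained DDPG actor)."""
    peak = max(zone_loads)
    exps = [math.exp((x - peak) / temperature) for x in zone_loads]
    total = sum(exps)
    return [e / total for e in exps]


def lower_admit(services, ratios, total_capacity):
    """Stand-in for the lower controller: a binary admit/reject decision
    per request, limited by the zone budget set by the upper controller
    (greedy rule, not a trained DQN)."""
    budgets = [r * total_capacity for r in ratios]
    decisions = []
    for s in services:
        admit = s.demand <= budgets[s.zone]
        if admit:
            budgets[s.zone] -= s.demand
        decisions.append(admit)
    return decisions


def jain_fairness(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n at worst.
    Assumed metric; the abstract does not name the one the paper uses."""
    n = len(values)
    squares = sum(v * v for v in values)
    return (sum(values) ** 2) / (n * squares) if squares > 0 else 1.0


# Toy scenario: two equally loaded zones sharing 20 resource units.
services = [Service(0, 4.0), Service(0, 4.0), Service(1, 3.0), Service(1, 9.0)]
ratios = upper_allocate([2.0, 2.0])
decisions = lower_admit(services, ratios, total_capacity=20.0)

# Per-zone admission rate, then fairness across zones.
rates = []
for z in range(len(ratios)):
    zone_decisions = [d for s, d in zip(services, decisions) if s.zone == z]
    rates.append(sum(zone_decisions) / len(zone_decisions))
fairness = jain_fairness(rates)
```

In a learning-based version, a reward built from quantities such as the admission rates and the fairness value would presumably feed back into both controllers, which is the role the abstract assigns to its reward and punishment mechanism.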
