Zhao Hongwei, Li Xuyan, Li Chengrui, Yao Lu
Department of Intelligent Science and Information Engineering, Shenyang University, Shenyang 110000, China.
Sensors (Basel). 2025 Aug 6;25(15):4838. doi: 10.3390/s25154838.
Efficient task offloading for delay-sensitive applications, such as autonomous driving, presents a significant challenge in multi-hop Vehicular Edge Computing (VEC) networks, primarily due to high vehicle mobility, dynamic network topologies, and complex end-to-end congestion. To address these issues, this paper proposes a graph attention-based reinforcement learning algorithm, named GAPO. The algorithm models the dynamic VEC network as an attributed graph and uses a graph neural network (GNN) to learn a network state representation that captures the global topological structure and node contextual information. Building on this foundation, an attention-based Actor-Critic framework makes joint offloading decisions by intelligently selecting the optimal destination and collaboratively determining the offloading and resource-allocation ratios. A multi-objective reward function, designed to minimize task latency and alleviate link congestion, guides the entire learning process. Comprehensive simulation experiments and ablation studies show that, compared to traditional heuristic algorithms and standard deep reinforcement learning (DRL) methods, GAPO significantly reduces average task completion latency and substantially decreases backbone link congestion. In conclusion, by deeply integrating the state-aware capabilities of GNNs with the decision-making abilities of DRL, GAPO provides an efficient, adaptive, and congestion-aware solution to resource management in dynamic VEC environments.
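The abstract describes a multi-objective reward that penalizes both task latency and backbone link congestion. A minimal sketch of a reward of that kind is shown below; the weighted-sum form, the weights, and the threshold-based congestion measure are illustrative assumptions, not the paper's actual formulation.

```python
def reward(task_latency: float, link_utilizations: list[float],
           w_latency: float = 1.0, w_congestion: float = 0.5,
           congestion_threshold: float = 0.8) -> float:
    """Negative weighted sum of task latency and excess link utilization.

    Hypothetical multi-objective reward: the agent is penalized for the
    task's completion latency and for any link load above a congestion
    threshold, so minimizing latency and avoiding congested backbone
    links are traded off through the two weights.
    """
    # Congestion term: total utilization above the threshold across links.
    excess = sum(max(0.0, u - congestion_threshold) for u in link_utilizations)
    return -(w_latency * task_latency + w_congestion * excess)
```

With weights like these, an offloading choice that shaves latency but pushes a backbone link past the threshold can still score worse than a slightly slower, uncongested route, which matches the congestion-aware behavior the abstract attributes to GAPO.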