Wang Zhen, Wang Yongjie, Xiong Xinli, Ren Qiankun, Huang Jun
College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China.
Anhui Province Key Laboratory of Cyberspace Security Situation Awareness and Evaluation, Hefei 230037, China.
Entropy (Basel). 2025 Jun 11;27(6):622. doi: 10.3390/e27060622.
Autonomous cyber defense (ACD) has shown its effectiveness against sophisticated cyber attacks and the dynamic characteristics of cyberspace. However, traditional decision-making methods for ACD cannot effectively characterize network topology and internode dependencies, which makes it difficult for defenders to identify key nodes and critical attack paths. This paper therefore proposes an enhanced decision-making method that combines graph embedding with reinforcement learning. By constructing a game model of cyber confrontation, we model the elements of the network topology that matter for decision-making, guiding the defender to dynamically optimize its strategy based on topology awareness. We augment reinforcement learning with the Node2vec algorithm to characterize network information for the defender. Node attributes and network structural features are embedded into low-dimensional vectors instead of the traditional one-hot encoding, which addresses the perceptual bottleneck in high-dimensional, sparse environments. In addition, we extend the training environment Cyberwheel with new fine-grained defense mechanisms to enhance the utility and portability of ACD. In experiments, we compare our graph-embedding-based decision-making method with traditional perception methods. The results verify the superior performance of our approach in strategy selection for defensive decision-making. We also analyze and compare different Node2vec parameter settings to determine their impact on embedding effectiveness for ACD decision-making.
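The abstract does not give implementation details, so the following is only a minimal sketch of the kind of second-order biased random walk that Node2vec uses to sample node sequences from a graph. It is not the authors' code. The toy topology (`gateway`, `web`, `mail`, `db`, `backup`), the function name `node2vec_walk`, and the parameter values are all assumptions chosen for illustration. In Node2vec, walks like these are then passed to a skip-gram model to produce the low-dimensional node vectors; that training step is omitted here.

```python
import random


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Sample one Node2vec-style biased walk over an unweighted graph.

    adj maps each node to the set of its neighbors. p is the return
    parameter and q is the in-out parameter.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(adj[cur])  # sorted for reproducibility
        if not neighbors:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical five-host network, used only for illustration.
adj = {
    "gateway": {"web", "mail"},
    "web": {"gateway", "db"},
    "mail": {"gateway", "db"},
    "db": {"web", "mail", "backup"},
    "backup": {"db"},
}

rng = random.Random(0)
walks = [
    node2vec_walk(adj, node, length=6, p=1.0, q=0.5, rng=rng)
    for node in sorted(adj)
    for _ in range(10)
]
```

With q below 1 the walk favors moving outward, which emphasizes longer paths through the topology; with q above 1 it stays near the start node and emphasizes local neighborhoods. The resulting vector dimension is chosen independently of the number of hosts, whereas a one-hot encoding needs one dimension per host.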