Soft actor-critic algorithm and improved GNN model in secure access control of disaggregated optical networks.

Author information

Zhao Zhenqian, Wang Yuhe

Affiliations

School of Railway Transportation, Shaanxi College of Communications Technology, Xi'an City, 710018, China.

Xian Isoftstone Technology Service Co., Ltd., Xi'an City, 710018, China.

Publication information

Sci Rep. 2025 Aug 11;15(1):29358. doi: 10.1038/s41598-025-15225-z.

To address the challenges of coordinated defense amid dynamic topology evolution and multidimensional security threats in disaggregated optical networks, this study introduces the Graph-Entangled Security Actor-Critic (GESAC) model. GESAC is built on spatiotemporal modeling of evolving topologies and uses a cross-layer spatiotemporal Graph Neural Network (GNN) to capture causal dependencies between optical path switching and access requests. It also adaptively delineates security boundaries across multiple domains through federated representation learning. Within this framework, the Soft Actor-Critic (SAC) algorithm provides the policy optimization mechanism. By integrating entropy-guided multi-objective reinforcement learning, GESAC maps encoded network states to access control strategies, jointly optimizing security, service quality, and system resilience. Experimental validation is conducted on a heterogeneous dataset comprising Cooperative Association for Internet Data Analysis (CAIDA) topology data, Canadian Institute for Cybersecurity Intrusion Detection Systems (CIC-IDS) access logs, and International Telecommunication Union Telecommunication Standardization Sector (ITU-T) threat characteristics. The dataset encompasses 12 attack scenarios, 57,000 dynamic topology sequences, and 2.8 million cross-domain authentication events. Key findings include: (1) Threat Detection: GESAC achieves an F1-score of 0.915-0.931 in identifying attacks such as physical-layer wavelength eavesdropping and cross-domain privilege escalation, with a false positive rate as low as 0.7%. (2) Resource Optimization: Compared to greedy strategies, GESAC improves wavelength utilization variance by up to 58.9% and reduces end-to-end latency standard deviation by up to 57.7% under high-load conditions. (3) Policy Robustness: In scenarios involving topological mutations, the model increases Pareto frontier coverage by over 100% and reduces the policy entropy decay rate by more than 65%. (4) Scalability: At a scale of 100,000 network nodes, GESAC achieves a single-step decision latency of 25.6 µs and significantly reduces communication overhead. GESAC is designed to overcome the limitations of static security policies in the face of dynamic disaggregation and large-scale attacks in optical networks. By integrating causal inference with game-theoretic equilibrium, it shifts the security control paradigm from passive defense to proactive resilience and provides an interpretable, highly adaptive foundation for next-generation architectures such as multi-domain collaboration and computing-network convergence.
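
The abstract does not give GESAC's equations, so the following is only a minimal sketch of the standard entropy-regularized (soft) value that SAC optimizes, written for a discrete set of access-control actions. The weighted-sum reward in `scalarize`, its weights, the example action set, and all function names are illustrative assumptions, not the authors' formulation.

```python
import math

def scalarize(security, qos, resilience, weights=(0.5, 0.3, 0.2)):
    """Hypothetical weighted-sum reward over the three objectives named in
    the abstract; the weights are an assumption, not taken from the paper."""
    w_sec, w_qos, w_res = weights
    return w_sec * security + w_qos * qos + w_res * resilience

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def soft_value(q_values, probs, alpha):
    """Soft state value: V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s))."""
    return sum(p * (q - alpha * math.log(p))
               for q, p in zip(q_values, probs) if p > 0)

def boltzmann_policy(q_values, alpha):
    """Policy proportional to exp(Q/alpha); it maximizes the soft value."""
    q_max = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - q_max) / alpha) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]

# Example: Q-values for three hypothetical actions (permit, deny, quarantine).
q = [1.0, 0.2, 0.6]
policy = boltzmann_policy(q, alpha=0.5)
print(policy, entropy(policy), soft_value(q, policy, alpha=0.5))
```

The temperature `alpha` sets the trade-off: a larger value keeps the policy closer to uniform (higher entropy), while a value near zero approaches the greedy choice.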
