Reinforcement Q-Learning-Based Adaptive Encryption Model for Cyberthreat Mitigation in Wireless Sensor Networks

Authors

Premakumari Sreeja Balachandran Nair, Sundaram Gopikrishnan, Rivera Marco, Wheeler Patrick, Guzmán Ricardo E Pérez

Affiliations

Department of Information Technology, Karpagam College of Engineering, Myleripalayam Village, Coimbatore 641032, Tamil Nadu, India.

School of Computer Science and Engineering, VIT-AP University, Amaravati 522241, Andhra Pradesh, India.

Publication information

Sensors (Basel). 2025 Mar 26;25(7):2056. doi: 10.3390/s25072056.

DOI:10.3390/s25072056

PMID:40218569

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11990953/

Abstract

The increasing prevalence of cyber threats in wireless sensor networks (WSNs) necessitates adaptive and efficient security mechanisms to ensure robust data transmission while addressing resource constraints. This paper proposes a reinforcement learning-based adaptive encryption framework that dynamically scales encryption levels based on real-time network conditions and threat classification. The proposed model leverages a deep learning-based anomaly detection system to classify network states into low, moderate, or high threat levels, which guides encryption policy selection. The framework integrates dynamic Q-learning for optimizing energy efficiency in low-threat conditions and double Q-learning for robust security adaptation in high-threat environments. A Hybrid Policy Derivation Algorithm is introduced to balance encryption complexity and computational overhead by dynamically switching between these learning models. The proposed system is formulated as a Markov Decision Process (MDP), where encryption level selection is driven by a reward function that optimizes the trade-off between energy efficiency and security robustness. The adaptive learning strategy employs an ϵ-greedy exploration-exploitation mechanism with an exponential decay rate to enhance convergence in dynamic WSN environments. The model also incorporates a dynamic hyperparameter tuning mechanism that optimally adjusts learning rates and exploration parameters based on real-time network feedback. Experimental evaluations conducted in a simulated WSN environment demonstrate the effectiveness of the proposed framework, achieving a 30.5% reduction in energy consumption, a 92.5% packet delivery ratio (PDR), and a 94% mitigation efficiency against multiple cyberattack scenarios, including DDoS, black-hole, and data injection attacks. Additionally, the framework reduces latency by 37% compared to conventional encryption techniques, ensuring minimal communication delays. These results highlight the scalability and adaptability of reinforcement learning-driven adaptive encryption in resource-constrained networks, paving the way for real-world deployment in next-generation IoT and WSN applications.
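
The learning components named in the abstract (an MDP over threat levels, ϵ-greedy exploration with exponential decay, and a switch between plain and double Q-learning) can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the three encryption levels, the reward weights, the decay constants, the rule "use double Q-learning only when the threat is high", and every function name below are assumptions made for illustration, since the abstract does not specify them.

```python
import math
import random

THREATS = ("low", "moderate", "high")  # MDP states: detector's threat class
LEVELS = (0, 1, 2)                     # MDP actions: hypothetical encryption levels


def epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.01):
    """Exploration rate with exponential decay (constants are assumed)."""
    return eps_min + (eps_start - eps_min) * math.exp(-decay * step)


def reward(threat, level, w_security=0.8, w_energy=0.2):
    """Toy reward: favour a level matching the threat, penalise energy use."""
    security = 1.0 - abs(THREATS.index(threat) - level) / 2.0
    energy_cost = level / 2.0
    return w_security * security - w_energy * energy_cost


def new_tables():
    """One table for plain Q-learning ('q'), two for double Q-learning."""
    return {name: {s: [0.0] * len(LEVELS) for s in THREATS}
            for name in ("q", "qa", "qb")}


def action_values(tables, s):
    """Values used for action selection in state s (assumed switching rule)."""
    if s == "high":
        return [tables["qa"][s][a] + tables["qb"][s][a] for a in LEVELS]
    return tables["q"][s]


def choose_level(values, eps, rng):
    """Epsilon-greedy choice: explore with probability eps, else exploit."""
    if rng.random() < eps:
        return rng.choice(LEVELS)
    return max(LEVELS, key=lambda a: values[a])


def hybrid_step(tables, s, a, r, s_next, rng, alpha=0.1, gamma=0.9):
    """Apply one update, picking the learner from the current threat level."""
    if s == "high":
        # Double Q-learning: one table selects the next action,
        # the other table evaluates it; roles are swapped at random.
        qa, qb = tables["qa"], tables["qb"]
        if rng.random() < 0.5:
            qa, qb = qb, qa
        best = max(LEVELS, key=lambda x: qa[s_next][x])
        qa[s][a] += alpha * (r + gamma * qb[s_next][best] - qa[s][a])
    else:
        # Plain Q-learning: bootstrap from the max of the same table.
        q = tables["q"]
        q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])


if __name__ == "__main__":
    rng = random.Random(0)
    tables = new_tables()
    s = "low"
    for t in range(1000):
        a = choose_level(action_values(tables, s), epsilon(t), rng)
        s_next = rng.choice(THREATS)  # assumed: threat evolves independently
        hybrid_step(tables, s, a, reward(s, a), s_next, rng)
        s = s_next
    print({s: action_values(tables, s) for s in THREATS})
```

In this sketch each learner bootstraps only from its own tables, which is a simplification; how the paper's Hybrid Policy Derivation Algorithm shares value estimates between the two learners, and how its hyperparameters are tuned online, is not stated in the abstract.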
