Strickland Caroline, Zakar Muhammad, Saha Chandrika, Soltani Nejad Sareh, Tasnim Noshin, Lizotte Daniel J, Haque Anwar
Department of Computer Science, The University of Western Ontario, London, ON N6A 3K7, Canada.
Sensors (Basel). 2024 Apr 25;24(9):2746. doi: 10.3390/s24092746.
Our increasingly connected world faces an ever-growing number of network-based attacks. An Intrusion Detection System (IDS) is an essential security technology for detecting these attacks. Although numerous Machine Learning-based IDSs have been proposed for detecting malicious network traffic, most have difficulty detecting and classifying the less common attack types. In this paper, we implement a novel hybrid technique in which synthetic data produced by a Generative Adversarial Network (GAN) is used to train a Deep Reinforcement Learning (DRL) model. Our GAN model is trained on the NSL-KDD dataset, a publicly available collection of labeled network traffic data specifically designed to support the evaluation and benchmarking of IDSs. Our findings demonstrate that training the DRL model on synthetic datasets generated by specific GAN models can yield better performance in correctly classifying minority classes than training on the true imbalanced dataset.
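The idea of offsetting class imbalance with generated samples can be illustrated with a minimal Python sketch. This is not the authors' implementation: `balance_with_synthetic`, `toy_generator`, the class counts, and the three-dimensional feature vectors are all hypothetical, and the random generator merely stands in for a trained GAN. The sketch only shows how minority classes might be topped up to the majority class size before a classifier (such as a DRL agent) is trained.

```python
import random
from collections import Counter


def balance_with_synthetic(samples, generate, target=None, seed=0):
    """Top up every class to `target` samples using a synthetic generator.

    samples:  list of (features, label) pairs from the real, imbalanced data.
    generate: callable (label, rng) -> features; a stand-in for a trained
              conditional generator (hypothetical interface, an assumption).
    target:   desired per-class count; defaults to the largest class size.
    """
    rng = random.Random(seed)
    counts = Counter(label for _, label in samples)
    if target is None:
        target = max(counts.values())

    augmented = list(samples)
    for label, n in counts.items():
        # Add only as many synthetic samples as the class is missing.
        for _ in range(max(0, target - n)):
            augmented.append((generate(label, rng), label))

    rng.shuffle(augmented)
    return augmented


def toy_generator(label, rng):
    # Placeholder for a GAN generator: returns a random feature vector.
    return [rng.random() for _ in range(3)]


# Toy imbalanced data; the counts are illustrative, not NSL-KDD statistics.
real = ([([0.0, 0.0, 0.0], "normal")] * 50
        + [([1.0, 0.0, 0.0], "dos")] * 30
        + [([0.0, 1.0, 0.0], "r2l")] * 5
        + [([0.0, 0.0, 1.0], "u2r")] * 2)

balanced = balance_with_synthetic(real, toy_generator)
print(Counter(label for _, label in balanced))
```

In this sketch every class ends up with the same number of samples, so a model trained on `balanced` sees rare attack types as often as common ones. Whether that improves minority-class detection in practice depends on the quality of the generator.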