Uddin Md Ashraf, Aryal Sunil, Bouadjenek Mohamed Reda, Al-Hawawreh Muna, Talukder Md Alamin
School of Information Technology, Deakin University, Waurn Ponds Campus, Geelong, Australia.
Department of Computer Science and Engineering, International University of Business Agriculture and Technology, Dhaka, Bangladesh.
Sci Rep. 2024 Nov 24;14(1):29103. doi: 10.1038/s41598-024-80021-0.
The rapid expansion of diverse network systems, including the Internet of Things (IoT) and the Industrial Internet of Things (IIoT), has led to a growing range of cyber threats. Robust protection against these threats requires an effective Intrusion Detection System (IDS). For more than a decade, researchers have explored supervised machine learning techniques to build IDSs that classify traffic as normal or attack. However, building effective IDS models with supervised learning requires a substantial number of both benign and attack samples. Collecting enough attack samples from real-life scenarios is not possible because cyber attacks occur only occasionally. Furthermore, an IDS trained and tested on known datasets fails to detect zero-day or unknown attacks because attack patterns evolve swiftly. To address this challenge, we propose two strategies for semi-supervised learning-based IDS in which no attack training samples are required: (1) training a supervised machine learning model on synthetic attack samples dispersed randomly and uniformly; (2) building a One-Class Classification (OCC) model trained exclusively on benign network traffic. We implemented both approaches and compared their performance on 10 recent benchmark IDS datasets. Our findings demonstrate that the OCC model based on the state-of-the-art anomaly detection technique usfAD significantly outperforms conventional supervised classification and other OCC-based techniques when trained and tested under real-life conditions, particularly in detecting previously unseen attacks.
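The two strategies can be illustrated with a minimal sketch using only the Python standard library. It rests on several assumptions that the abstract does not state: features are min-max scaled to [0, 1], the supervised model is a simple k-nearest-neighbour vote, and the one-class model is a nearest-neighbour distance threshold. Neither stand-in is usfAD or any model evaluated in the paper, and all function names, class names, and toy data below are hypothetical.

```python
import math
import random


def uniform_synthetic_attacks(n_samples, n_features, rng):
    """Strategy 1 helper: draw pseudo-attack points uniformly from [0, 1)^d.

    Assumes features are already min-max scaled (an assumption of this sketch).
    """
    return [[rng.random() for _ in range(n_features)] for _ in range(n_samples)]


def knn_predict(train_x, train_y, x, k=5):
    """Stand-in supervised model: majority vote of the k nearest training points.

    Labels: 0 = benign, 1 = attack.
    """
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    attack_votes = sum(label for _, label in nearest)
    return 1 if 2 * attack_votes > k else 0


class NearestNeighbourOCC:
    """Strategy 2 stand-in: a one-class model fitted on benign samples only."""

    def fit(self, benign, quantile=0.95):
        self.benign = benign
        # Leave-one-out distance from each benign point to its nearest benign neighbour.
        loo = sorted(
            min(math.dist(a, b) for j, b in enumerate(benign) if j != i)
            for i, a in enumerate(benign)
        )
        # Points farther from all benign samples than this are flagged as attacks.
        self.threshold = loo[int(quantile * (len(loo) - 1))]
        return self

    def predict(self, x):
        nearest = min(math.dist(x, b) for b in self.benign)
        return 1 if nearest > self.threshold else 0


# Toy data (hypothetical): benign traffic forms one tight cluster in two features.
rng = random.Random(0)
benign = [[rng.gauss(0.5, 0.03), rng.gauss(0.5, 0.03)] for _ in range(200)]

# Strategy 1: benign samples plus uniformly dispersed synthetic "attacks".
synthetic = uniform_synthetic_attacks(300, 2, rng)
train_x = benign + synthetic
train_y = [0] * len(benign) + [1] * len(synthetic)

# Strategy 2: fit on benign traffic alone.
occ = NearestNeighbourOCC().fit(benign)

unseen_attack = [0.9, 0.1]   # far from the benign cluster, never seen in training
typical_benign = [0.5, 0.5]  # centre of the benign cluster

print("supervised + synthetic:", knn_predict(train_x, train_y, unseen_attack),
      knn_predict(train_x, train_y, typical_benign))
print("one-class:", occ.predict(unseen_attack), occ.predict(typical_benign))
```

In both cases no real attack sample is used for training. The first strategy turns the problem into ordinary binary classification by treating uniform noise as the attack class, while the second models only the benign region and flags whatever falls outside it.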