Large-scale IoT attack detection scheme based on LightGBM and feature selection using an improved salp swarm algorithm.

Authors

Chen Weizhe, Yang Hongyu, Yin Lihua, Luo Xi

Affiliation

Cyberspace Institute of Advanced Technology, Guangzhou University, Guangzhou, 510006, China.

Publication information

Sci Rep. 2024 Aug 19;14(1):19165. doi: 10.1038/s41598-024-69968-2.

DOI:10.1038/s41598-024-69968-2

PMID:39160210

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11333491/

Abstract

The rapid growth of the Internet of Things (IoT) has sharply increased the number of interconnected devices that send and exchange vital data across networks. At the same time, attacks on IoT systems are becoming more frequent, posing a persistent risk to the security and privacy of IoT data, so an efficient method for detecting IoT cyber threats is essential. Several current network attack detection schemes, however, suffer from insufficient detection accuracy, the curse of dimensionality caused by excessively high-dimensional data, and the low efficiency of complex models. Among the available solutions, applying metaheuristic algorithms to feature selection on network data is an effective strategy. This study introduces a more comprehensive metaheuristic algorithm called GQBWSSA, an enhanced version of the Salp Swarm Algorithm that incorporates several strategy improvements. Based on this algorithm, a threshold voting-based feature selection framework is designed to obtain an optimized feature set. This procedure reduces the dimensionality of the data, which avoids the negative effects of high dimensionality while retaining the most significant information. The selected features are then combined with the LightGBM algorithm to form a lightweight and efficient ensemble learning scheme for IoT attack detection. Experimental evaluation on the NSLKDD and CICIoT2023 datasets shows that the proposed metaheuristic algorithm outperforms recent metaheuristic algorithms in feature selection. Compared with currently popular ensemble learning solutions, the overall scheme performs strongly on multiple key indicators, including accuracy, precision, and training and detection time. In particular, on the large-scale CICIoT2023 dataset, the proposed scheme achieves an accuracy of 99.70% in binary classification and 99.41% in multiclass classification.
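
The abstract does not spell out how the threshold voting step works, so the following is only a minimal sketch of one plausible reading: a feature selector (here standing in for GQBWSSA) is run several times, each run yields a binary mask over the features, and a feature is kept when the share of runs that selected it reaches a threshold. The function names `threshold_vote` and `project`, the example masks, and the threshold value are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def threshold_vote(masks: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return indices of features selected in at least `threshold` of the runs.

    Each mask is a 0/1 sequence with one entry per feature. This voting rule
    is an assumption made for illustration, not the paper's exact procedure.
    """
    if not masks:
        raise ValueError("at least one mask is required")
    n_features = len(masks[0])
    if any(len(m) != n_features for m in masks):
        raise ValueError("all masks must have the same length")
    n_runs = len(masks)
    # Count how many runs voted for each feature.
    votes = [sum(m[j] for m in masks) for j in range(n_features)]
    return [j for j, v in enumerate(votes) if v / n_runs >= threshold]


def project(rows: Sequence[Sequence[float]], selected: Sequence[int]) -> List[List[float]]:
    """Keep only the selected feature columns of each row."""
    return [[row[j] for j in selected] for row in rows]


# Hypothetical masks from three independent selector runs over five features.
masks = [
    [1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 1, 0, 1, 0],
]
selected = threshold_vote(masks, threshold=0.6)
reduced = project([[10, 20, 30, 40, 50]], selected)
```

In a full pipeline, the reduced feature matrix would then be passed to a gradient boosting classifier such as LightGBM; that step is omitted here to keep the sketch dependency-free.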
