An IoT intrusion detection framework based on feature selection and large language models fine-tuning.

Authors

Ma Huan, Zhang Wan, Zhang Dalong, Chen Baozhan

Affiliations

School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou, 450001, China.

Software Engineering College, Zhengzhou University of Light Industry, Zhengzhou, 450066, China.

Publication

Sci Rep. 2025 Jul 1;15(1):21158. doi: 10.1038/s41598-025-08905-3.

Abstract

The rapid proliferation of Internet of Things (IoT) devices has significantly expanded the network attack surface, necessitating advanced artificial intelligence (AI)-based intrusion detection systems (IDS) to bolster IoT security. Existing methods, however, face two significant challenges: (1) Feature redundancy: current approaches extract numerous flow-level features to learn attack behavior, resulting in high computational complexity and substantial redundant information. (2) Class imbalance: limited attack traffic samples hinder models from effectively learning attack patterns. Moreover, existing algorithms typically address only one of these issues, overlooking their interconnection. We therefore propose FSLLM, an IoT intrusion detection framework based on feature selection and large language models (LLMs). At its core is a multi-stage feature selection algorithm that combines the Minimum Redundancy Maximum Relevance (mRMR) algorithm with a Covariance Matrix Adaptation Evolution Strategy (CMA-ES) improved by the Pearson Correlation Coefficient (PCC). The algorithm uses CMA-ES for feature search while also accounting for the mutual information and collinearity among features, and thereby removes redundant features more effectively. We then use the selected representative features to fine-tune LLMs and generate additional attack samples, which reduces the computational cost of fine-tuning while producing higher-quality samples. Furthermore, we employ LightGBM improved with the Focal Loss (FL) function as the classifier to improve detection performance. We evaluate our framework on five IoT intrusion detection datasets: NF-ToN-IoT-v2, NF-UNSW-NB15-v2, NF-BoT-IoT-v2, NF-CSE-CIC-IDS2018-v2, and CIC-ToN-IoT. Experimental results demonstrate that FSLLM achieves accuracy comparable or superior to current state-of-the-art methods while reducing redundant features by over 80%.
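The abstract names Focal Loss as the modification applied to the LightGBM classifier but does not give its form. As a rough illustration only, the sketch below implements the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), in plain Python. The binary setting, the function name `focal_loss`, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions made for this example, not settings reported for FSLLM. Using it as a LightGBM training objective would additionally require its gradient and Hessian, which are not shown here.

```python
import math


def focal_loss(y_true, p_pred, gamma=2.0, alpha=0.25, eps=1e-12):
    """Mean binary focal loss for labels in {0, 1} and predicted probabilities.

    gamma and alpha are illustrative defaults (assumptions), not values
    taken from the FSLLM paper.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        p_t = p if y == 1 else 1.0 - p           # probability given to the true class
        a_t = alpha if y == 1 else 1.0 - alpha   # class-dependent weight
        # (1 - p_t)^gamma shrinks the loss of well-classified samples
        total += -a_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(y_true)


if __name__ == "__main__":
    # A confidently correct attack sample versus a badly misclassified one.
    print("easy sample:", focal_loss([1], [0.9]))
    print("hard sample:", focal_loss([1], [0.1]))
```

With `gamma=0` the modulating factor disappears and the loss reduces to class-weighted cross-entropy. Larger `gamma` shifts the training signal toward hard, often minority-class, samples, which is why focal loss is a common choice under class imbalance.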

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6aca/12218583/6b12e9df8025/41598_2025_8905_Fig1_HTML.jpg
