An effective method for anomaly detection in industrial Internet of Things using XGBoost and LSTM.

Authors

Chen Zhen, Li ZhenWan, Huang Jia, Liu ShengZheng, Long HaiXia

Affiliations

College of Information Science Technology, Hainan Normal University, No. 99 LongKun South Road, Haikou city, 571158, Hainan Province, China.

Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou city, 571158, Hainan Province, China.

Publication information

Sci Rep. 2024 Oct 14;14(1):23969. doi: 10.1038/s41598-024-74822-6.

Abstract

In recent years, with the application of Internet of Things (IoT) and cloud technology in smart industrialization, the Industrial Internet of Things (IIoT) has become an emerging hot topic. The growing volume of data and number of devices in IIoT pose significant security challenges, making anomaly detection particularly important. Existing methods for anomaly detection in IIoT often fall short when dealing with data imbalance, and the huge amount of IIoT data makes feature selection challenging and computationally intensive. In this paper, we propose an optimal deep learning model for anomaly detection in IIoT. First, features are selected with eXtreme Gradient Boosting (XGBoost) under different importance thresholds: features with importance above the given threshold are retained, while those below are ignored. Different thresholds yield different numbers of features. This approach not only retains effective features but also reduces the feature dimensionality, thereby decreasing the consumption of computational resources. Second, an optimized loss function is designed, and its impact on model performance is studied in terms of handling imbalanced data, highly similar categories, and model training. We select the optimal threshold and loss function, which are part of our optimal model, by comparing accuracy, precision, recall, False Alarm Rate (FAR), Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and Area Under the Precision-Recall Curve (AUC-PR). Finally, combining the optimal threshold and loss function, we propose a model named MIX_LSTM for anomaly detection in IIoT. Experiments are conducted using the UNSW-NB15 and NSL-KDD datasets. In the binary anomaly detection experiment on the UNSW-NB15 dataset, the proposed MIX_LSTM model achieves a FAR of 0.084, an AUC-ROC of 0.984, and an AUC-PR of 0.988. On the NSL-KDD dataset, it achieves a FAR of 0.028, an AUC-ROC of 0.967, and an AUC-PR of 0.962. On these evaluation metrics, the model performs well in detecting attacks in the IIoT compared with traditional deep learning models, machine learning models, and existing techniques.
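
The threshold-based feature selection and the FAR metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The importance scores and feature names below are hypothetical stand-ins for what a trained XGBoost model would report, keeping features exactly at the threshold is an assumption, and FAR is assumed here to mean the false positive rate FP / (FP + TN), which the abstract does not define.

```python
from typing import Dict, List, Sequence


def select_features(importances: Dict[str, float], threshold: float) -> List[str]:
    """Keep features whose importance is at or above the threshold.

    Keeping ties (score == threshold) is an assumption; the abstract only
    says that features above the threshold are retained.
    """
    return [name for name, score in importances.items() if score >= threshold]


def false_alarm_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed definition: FAR = FP / (FP + TN), with 1 = attack, 0 = normal."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no normal samples; FAR is undefined")
    return fp / (fp + tn)


# Hypothetical importance scores (not taken from the paper).
importances = {"feat_a": 0.40, "feat_b": 0.05, "feat_c": 0.25, "feat_d": 0.30}

# A higher threshold keeps fewer features.
for threshold in (0.1, 0.3):
    print(threshold, select_features(importances, threshold))

# Toy labels: four normal samples (one misflagged) and two attacks.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0]
print(false_alarm_rate(y_true, y_pred))
```

In a full pipeline, each candidate threshold would produce a reduced feature set on which the downstream model is trained and scored. The threshold with the best trade-off across the reported metrics would then be kept.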
