Methodology for the Detection of Contaminated Training Datasets for Machine Learning-Based Network Intrusion-Detection Systems.

Authors

Medina-Arco Joaquín Gaspar, Magán-Carrión Roberto, Rodríguez-Gómez Rafael Alejandro, García-Teodoro Pedro

Affiliation

Network Engineering & Security Group (NESG), University of Granada, 18012 Granada, Spain.

Publication details

Sensors (Basel). 2024 Jan 12;24(2):479. doi: 10.3390/s24020479.

DOI:10.3390/s24020479
PMID:38257574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10819357/
Abstract

With the significant increase in cyber-attacks and attempts to gain unauthorised access to systems and information, Network Intrusion-Detection Systems (NIDSs) have become essential detection tools. Anomaly-based systems use machine learning techniques to distinguish between normal and anomalous traffic. They do this by using training datasets that have been previously gathered and labelled, allowing them to learn to detect anomalies in future data. However, such datasets can be accidentally or deliberately contaminated, compromising the performance of NIDSs. This is the case for the UGR'16 dataset, in which botnet-type attacks in the subset intended for training were not identified during the labelling process. This paper addresses the mislabelling problem of real network traffic datasets by introducing a novel methodology that (i) allows analysing the quality of a network traffic dataset by identifying possible hidden or unidentified anomalies and (ii) selects the ideal subset of data to optimise the performance of the anomaly detection model even in the presence of hidden attacks erroneously labelled as normal network traffic. To this end, a two-step process that makes incremental use of the training dataset is proposed. Experiments conducted on the contaminated UGR'16 dataset in conjunction with the state-of-the-art NIDS Kitsune show that the approach can feasibly reveal observations of hidden botnet-based attacks in this dataset.
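
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of using a training set incrementally to expose hidden contamination. It is not the authors' method and does not use Kitsune. Everything in it is an assumption made for illustration: the single synthetic traffic feature, the toy z-score detector, the hypothetical helper `flag_suspicious_chunks`, the two thresholds, and the premise that the first chunk is clean.

```python
import random
import statistics


def fit(values):
    """Fit a toy detector: mean and standard deviation of one feature."""
    return statistics.fmean(values), statistics.pstdev(values)


def score(model, x):
    """Anomaly score: absolute z-score of x under the fitted model."""
    mean, std = model
    return abs(x - mean) / std


def flag_suspicious_chunks(chunks, z_threshold=3.0, rate_threshold=0.2):
    """Grow the training set chunk by chunk (illustrative assumption).

    Each new chunk is scored by a model fitted on the chunks accepted so
    far. A chunk with too many high-scoring observations is flagged as
    possibly contaminated and is left out of the accepted training subset.
    """
    accepted = list(chunks[0])  # assumption: the first chunk is clean
    suspicious = []
    for idx, chunk in enumerate(chunks[1:], start=1):
        model = fit(accepted)
        rate = sum(score(model, x) > z_threshold for x in chunk) / len(chunk)
        if rate > rate_threshold:
            suspicious.append(idx)
        else:
            accepted.extend(chunk)
    return suspicious, accepted


# Synthetic "training" data: six chunks of one traffic feature, all
# labelled normal. Half of chunk 3 is replaced by shifted values that
# stand in for hidden attack traffic mislabelled as normal.
rng = random.Random(0)
chunks = [[rng.gauss(100.0, 10.0) for _ in range(200)] for _ in range(6)]
chunks[3] = [
    rng.gauss(160.0, 10.0) if i % 2 == 0 else x
    for i, x in enumerate(chunks[3])
]

suspicious, accepted = flag_suspicious_chunks(chunks)

# Compare a model trained on everything with one trained on the
# accepted subset, using an attack-like value as the probe.
model_all = fit([x for chunk in chunks for x in chunk])
model_clean = fit(accepted)

print("suspicious chunks:", suspicious)
print("score with contaminated training:", round(score(model_all, 160.0), 2))
print("score with selected subset:", round(score(model_clean, 160.0), 2))
```

In this toy setting the contaminated chunk inflates the mean and spread learned from the full training set, so an attack-like value looks less anomalous than it does to a model fitted on the selected subset. That is the effect the abstract describes when hidden attacks are labelled as normal traffic.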

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/33caaad2e947/sensors-24-00479-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/dd58302f15d7/sensors-24-00479-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/26a02d4b5410/sensors-24-00479-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/bc0f77d3bced/sensors-24-00479-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/76a9cb700c0b/sensors-24-00479-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/529ebac782df/sensors-24-00479-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/a81eb293c03e/sensors-24-00479-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/3aed0c0ca59c/sensors-24-00479-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/1d7c447ff203/sensors-24-00479-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dda/10819357/ef46d9922981/sensors-24-00479-g010.jpg
