An Insider Data Leakage Detection Using One-Hot Encoding, Synthetic Minority Oversampling and Machine Learning Techniques.

Authors

Al-Shehari Taher, Alsowail Rakan A

Affiliation

Computer Skills, Self-Development Department, Deanship of Common First Year, King Saud University, Riyadh 11362, Saudi Arabia.

Publication

Entropy (Basel). 2021 Sep 27;23(10):1258. doi: 10.3390/e23101258.

Abstract

Insider threats are malicious acts that can be carried out by an authorized employee within an organization. They represent a major cybersecurity challenge for private and public organizations, as an insider attack can cause far more damage to an organization's assets than an external attack. Most existing approaches to insider threat detection focus on general insider attack scenarios. However, insider attacks can be carried out in different ways, and the most dangerous is a data leakage attack executed by a malicious insider shortly before leaving the organization. This paper proposes a machine learning-based model for detecting such serious insider threat incidents. The proposed model addresses the bias in detection results that an inappropriate encoding process can introduce by employing feature scaling and one-hot encoding. Furthermore, the class imbalance of the dataset is addressed using the synthetic minority oversampling technique (SMOTE). Well-known machine learning algorithms are evaluated to identify the most accurate classifier for detecting data leakage events executed by malicious insiders during the sensitive period before they leave an organization. We provide a proof of concept for our model by applying it to the CMU-CERT Insider Threat Dataset and comparing its output with the ground truth. The experimental results show that our model detects insider data leakage events with an AUC-ROC value of 0.99, outperforming existing approaches validated on the same dataset. The proposed model provides effective methods for addressing possible bias and class imbalance issues, with the aim of devising an effective insider data leakage detection system.
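The preprocessing and evaluation steps named in the abstract (one-hot encoding, feature scaling, SMOTE, AUC-ROC) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the toy records, the feature names `activity` and `volume`, and all function names are hypothetical, and the simplified `smote` function stands in for a full library implementation. Note that plain SMOTE interpolates every column, including the one-hot indicator columns.

```python
import random

def one_hot(values):
    """Encode categorical values as 0/1 indicator vectors, one column per category."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return rows, categories

def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(x - lo) / span if span else 0.0 for x in column]

def smote(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of
    its k nearest minority neighbours to create n_new synthetic samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

def auc_roc(labels, scores):
    """AUC-ROC as the probability that a random positive outscores a
    random negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy records: activity type, data volume, label (1 = leakage event).
activity = ["logon", "email", "usb", "email", "logon", "usb", "email", "usb"]
volume = [2, 5, 40, 4, 1, 55, 6, 48]
label = [0, 0, 1, 0, 0, 1, 0, 1]

# Build the feature matrix: one-hot columns followed by the scaled volume.
encoded, categories = one_hot(activity)
scaled = min_max_scale(volume)
X = [row + [s] for row, s in zip(encoded, scaled)]

# Oversample the minority class until both classes have the same size.
minority = [x for x, y in zip(X, label) if y == 1]
n_new = (len(label) - sum(label)) - sum(label)
synthetic = smote(minority, n_new)
X_bal = X + synthetic
y_bal = label + [1] * len(synthetic)

print("class counts after SMOTE:", y_bal.count(0), y_bal.count(1))
```

One-hot encoding avoids implying an order among categories, which an integer encoding such as logon=0, email=1, usb=2 would do, and scaling keeps the numeric column from dominating the distance computation inside `smote`. In practice, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the test set used to compute AUC-ROC.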

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c3f/8535057/a6f0f6f02fb1/entropy-23-01258-g001.jpg
