Musthafa Muhammad Bisri, Huda Samsul, Kodera Yuta, Ali Md Arshad, Araki Shunsuke, Mwaura Jedidah, Nogami Yasuyuki
Graduate School of Environmental, Life, Natural Science and Technology, Okayama University, Okayama 700-8530, Japan.
Green Innovation Center, Okayama University, Okayama 700-8530, Japan.
Sensors (Basel). 2024 Jul 1;24(13):4293. doi: 10.3390/s24134293.
Internet of Things (IoT) devices are driving advances in innovation, efficiency, and sustainability across various industries. However, as the number of connected IoT devices grows, the risk of intrusion becomes a major concern in IoT security. To prevent intrusions, it is crucial to implement intrusion detection systems (IDSs) that can detect and block such attacks. IDSs are a critical component of cybersecurity infrastructure, designed to detect and respond to malicious activity within a network or system. Traditional IDS methods rely on predefined signatures or rules to identify known threats, but these techniques may struggle to detect novel or sophisticated attacks. Implementing IDSs with machine learning (ML) and deep learning (DL) techniques has been proposed to improve their ability to detect attacks and thereby strengthen overall cybersecurity posture and resilience. However, ML and DL techniques face several issues that can degrade model performance and effectiveness, such as overfitting and the influence of unimportant features on finding meaningful patterns. To ensure better performance and reliability of ML models in IDSs when dealing with new and unseen threats, the models must be optimized by addressing overfitting and applying feature selection. In this paper, we propose a scheme to optimize IoT intrusion detection using class balancing and feature selection for preprocessing. We evaluated the scheme on the UNSW-NB15 and NSL-KDD datasets with two different ensemble models: a support vector machine (SVM) with bagging and a long short-term memory (LSTM) network with stacking. The performance results and confusion matrices show that LSTM stacking with analysis-of-variance (ANOVA) feature selection is the superior model for classifying network attacks.
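The ANOVA feature selection and SVM-with-bagging components described above can be sketched with scikit-learn. This is a minimal illustration, not the paper's implementation: the synthetic data stands in for the UNSW-NB15/NSL-KDD features, and the number of selected features and bagging hyperparameters are placeholder choices.

```python
# Hedged sketch: ANOVA F-test feature selection feeding a bagged SVM ensemble.
# Dataset, k, and hyperparameters are placeholders, not the paper's settings.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for an intrusion-detection feature matrix
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           weights=[0.8, 0.2], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

model = make_pipeline(
    StandardScaler(),                       # scale features for the SVM
    SelectKBest(f_classif, k=20),           # ANOVA F-test feature selection
    BaggingClassifier(SVC(kernel="rbf"), n_estimators=10, random_state=0),
)
model.fit(X_tr, y_tr)
print(f"test accuracy: {model.score(X_te, y_te):.3f}")
```

Wrapping the steps in a pipeline ensures the scaler and feature selector are fit only on training folds, which helps avoid the kind of leakage that inflates reported accuracy.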
It achieves remarkable accuracies of 96.92% and 99.77%, with overfitting values of 0.33% and 0.04% on the two datasets, respectively. The model's ROC curve also bends sharply toward the upper-left corner, with AUC values of 0.9665 and 0.9971 on the UNSW-NB15 and NSL-KDD datasets, respectively.
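The reported metrics can be reproduced with standard tooling. A common reading of the "overfitting value" is the gap between training and test accuracy; the sketch below computes that gap and the ROC AUC, using a simple placeholder classifier and synthetic data rather than the paper's LSTM stacking model.

```python
# Hedged sketch: overfitting gap (train minus test accuracy, one common
# definition) and ROC AUC. Classifier and data are placeholders.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1500, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
gap = clf.score(X_tr, y_tr) - clf.score(X_te, y_te)   # overfitting gap
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"overfitting gap: {gap:.4f}, AUC: {auc:.4f}")
```

A small gap together with a high AUC, as reported in the abstract, indicates the model generalizes rather than memorizing the training set.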