Vu Ly, Cao Van Loi, Nguyen Quang Uy, Nguyen Diep N, Hoang Dinh Thai, Dutkiewicz Eryk
IEEE Trans Cybern. 2022 May;52(5):3769-3782. doi: 10.1109/TCYB.2020.3013416. Epub 2022 May 19.
The Internet of Things (IoT) has emerged as a cutting-edge technology that is changing human life. The rapid and widespread adoption of IoT, however, makes cyberspace more vulnerable, especially to IoT-based attacks, in which IoT devices are used to launch attacks on cyber-physical systems. Given the massive number of IoT devices (on the order of billions), detecting and preventing these IoT-based attacks is critical. This task is very challenging, however, because IoT devices have limited energy and computing capabilities and because attackers evolve continuously and quickly. Among IoT-based attacks, unknown ones are far more devastating, as they can bypass most current security systems and it takes time to detect them and "cure" the affected systems. To effectively detect new/unknown attacks, in this article we propose a novel representation learning method that better "describes" unknown attacks in a predictive manner, which facilitates supervised learning-based anomaly detection methods. Specifically, we develop three regularized versions of autoencoders (AEs) to learn a latent representation from the input data. These regularized AEs are trained in a supervised manner using normal data and known IoT attacks, and the outputs of their bottleneck layers are then used as the new input features for classification algorithms. We carry out extensive experiments on nine recent IoT datasets to evaluate the performance of the proposed models. The experimental results demonstrate that the new latent representation can significantly enhance the performance of supervised learning methods in detecting unknown IoT attacks. We also conduct experiments to investigate the characteristics of the proposed models and the influence of hyperparameters on their performance. The running time of these models is about 1.3 ms, which is practical for most applications.
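The pipeline described above (a supervised, regularized AE whose bottleneck output feeds a classifier) can be illustrated with a minimal sketch. The abstract does not specify the three regularizers, so everything concrete here is an assumption made for illustration: the penalty that pulls each latent code toward a class-dependent target point, the linear encoder and decoder, the synthetic two-class data, the nearest-centroid classifier, and all function and parameter names. The sketch evaluates on held-out samples of the known classes only and does not reproduce the paper's unknown-attack experiments.

```python
import numpy as np

def train_regularized_ae(X, y, targets, lam=1.0, lr=0.01, epochs=500, seed=0):
    """Train a linear AE on: reconstruction error + lam * ||z - targets[y]||^2.

    The class-dependent latent penalty is an assumed regularizer, not
    necessarily one of the three proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = targets.shape[1]                      # bottleneck (latent) size
    We = 0.1 * rng.standard_normal((d, k))    # encoder weights
    Wd = 0.1 * rng.standard_normal((k, d))    # decoder weights
    be, bd = np.zeros(k), np.zeros(d)
    T = targets[y]                            # latent target for each sample
    losses = []
    for _ in range(epochs):
        Z = X @ We + be                       # bottleneck output
        Xr = Z @ Wd + bd                      # reconstruction
        rec = np.mean(np.sum((Xr - X) ** 2, axis=1))
        reg = np.mean(np.sum((Z - T) ** 2, axis=1))
        losses.append(rec + lam * reg)
        # Manual gradients of the loss above (plain full-batch gradient descent).
        dXr = 2.0 * (Xr - X) / n
        dZ = dXr @ Wd.T + 2.0 * lam * (Z - T) / n
        gWd, gbd = Z.T @ dXr, dXr.sum(axis=0)
        gWe, gbe = X.T @ dZ, dZ.sum(axis=0)
        We -= lr * gWe
        be -= lr * gbe
        Wd -= lr * gWd
        bd -= lr * gbd
    return (We, be), losses

def encode(params, X):
    """Map inputs to the bottleneck representation."""
    We, be = params
    return X @ We + be

def fit_centroids(Z, y):
    """Nearest-centroid classifier: one mean latent vector per class."""
    return np.stack([Z[y == c].mean(axis=0) for c in np.unique(y)])

def predict(centroids, Z):
    """Assign each latent code to the class of its closest centroid."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def make_data(rng, n, d=8):
    """Synthetic stand-in data: class 0 = normal, class 1 = known attack."""
    normal = rng.standard_normal((n, d))
    attack = rng.standard_normal((n, d)) + 1.5
    return np.vstack([normal, attack]), np.array([0] * n + [1] * n)

def demo():
    rng = np.random.default_rng(0)
    X_train, y_train = make_data(rng, 200)
    X_test, y_test = make_data(rng, 100)
    # Standardize with training statistics only.
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    X_train, X_test = (X_train - mean) / std, (X_test - mean) / std
    targets = np.array([[-1.0, -1.0], [1.0, 1.0]])  # assumed latent targets
    params, losses = train_regularized_ae(X_train, y_train, targets)
    centroids = fit_centroids(encode(params, X_train), y_train)
    accuracy = np.mean(predict(centroids, encode(params, X_test)) == y_test)
    return accuracy, losses

if __name__ == "__main__":
    accuracy, losses = demo()
    print(f"held-out accuracy: {accuracy:.3f}, final loss: {losses[-1]:.3f}")
```

In this sketch the classifier never sees the raw features, only the bottleneck output, which mirrors the role the abstract assigns to the latent representation. Any other classifier could be substituted for the nearest-centroid rule.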