Al-Mashhadi Saif, Anbar Mohammed, Hasbullah Iznan, Alamiedy Taief Alaa
National Advanced IPv6 Centre, Universiti Sains Malaysia, Penang, Malaysia.
Electrical Engineering, University of Baghdad, Baghdad, Baghdad, Iraq.
PeerJ Comput Sci. 2021 Aug 13;7:e640. doi: 10.7717/peerj-cs.640. eCollection 2021.
Botnets can simultaneously control millions of Internet-connected devices to launch damaging cyber-attacks that pose significant threats to the Internet. In a botnet, bot-masters communicate with the command and control server using various communication protocols. One widely used protocol is the Domain Name System (DNS), an essential Internet service. Bot-masters utilise Domain Generation Algorithms (DGA) and fast-flux techniques to avoid static blacklists and reverse engineering while remaining flexible. However, a botnet's DNS communication generates anomalous DNS traffic throughout the botnet life cycle, and such anomalies are considered an indicator of the presence of DNS-based botnets in the network. Although several approaches have been proposed to detect botnets based on DNS traffic analysis, the problem persists and remains challenging for several reasons, such as the failure to consider significant features and rules that contribute to the detection of DNS-based botnets. Therefore, this paper examines the abnormality of DNS traffic during the botnet life cycle to extract significant enriched features. These features are further analysed using two machine learning algorithms. The union of the outputs of the two algorithms forms a novel hybrid rule detection model. Two benchmark datasets are used to evaluate the performance of the proposed approach in terms of detection accuracy and false-positive rate. The experimental results show that the proposed approach achieves 99.96% accuracy and a 1.6% false-positive rate, outperforming other state-of-the-art DNS-based botnet detection approaches.
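The idea of taking the union of two rule-based outputs can be illustrated with a minimal sketch. The abstract does not name the two algorithms or the extracted features, so everything below is an assumption made for illustration: the features (`entropy`, `length`, `nxdomain_ratio`), the two rule sets, and all thresholds are hypothetical and are not the rules reported in the paper.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Character-level Shannon entropy (in bits) of a domain label."""
    if not label:
        return 0.0
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def extract_features(domain: str, nxdomain_ratio: float) -> dict:
    """Build a small, hypothetical feature vector for one queried domain.

    nxdomain_ratio is assumed to be the fraction of a host's DNS queries
    that failed to resolve; it is supplied by the caller in this sketch.
    """
    label = domain.split(".")[0]  # leftmost label only
    return {
        "entropy": shannon_entropy(label),
        "length": len(label),
        "nxdomain_ratio": nxdomain_ratio,
    }


def rule_set_a(f: dict) -> bool:
    """Hypothetical rules from a first learner: long, random-looking labels."""
    return f["entropy"] > 3.5 and f["length"] > 12


def rule_set_b(f: dict) -> bool:
    """Hypothetical rules from a second learner: many failed lookups."""
    return f["nxdomain_ratio"] > 0.5


def hybrid_predict(f: dict) -> bool:
    """Union of both rule sets: flag the query if either set fires."""
    return rule_set_a(f) or rule_set_b(f)


dga_like = extract_features("xk2j9qpl7vbn4tz.example.com", nxdomain_ratio=0.1)
benign = extract_features("google.com", nxdomain_ratio=0.0)
failing = extract_features("mail.example.com", nxdomain_ratio=0.8)
```

In this sketch, `hybrid_predict` flags `dga_like` through the first rule set and `failing` through the second, while `benign` passes both. A union raises detection coverage at the cost of inheriting the false positives of both rule sets, which is why the false-positive rate is reported alongside accuracy.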