College of Computer, National University of Defense Technology, Changsha 410073, China.
School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018, China.
Sensors (Basel). 2019 Feb 10;19(3):716. doi: 10.3390/s19030716.
Protocol Reverse Engineering (PRE) is crucial for the information security of the Internet of Things (IoT), and message clustering determines the effectiveness of PRE. However, the quality of service still falls short of the strict requirements of IoT applications, because message clustering results are often coarse-grained and largely ignore the intrinsic type information hidden in messages. To address this problem, this study proposes a type-aware approach to message clustering guided by type information. The approach treats a message as a combination of n-grams and employs the Latent Dirichlet Allocation (LDA) model to characterize messages in terms of types and n-grams by inferring the type distribution of each message. The type distribution is then used to measure the similarity between messages. Based on this similarity, the approach clusters messages and then extracts message formats. Experiments comparing the approach against Netzob on a number of protocols indicate that correctness and conciseness improve significantly, e.g., by 43.86% and 3.87%, respectively, for the CoAP protocol.
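The pipeline described in the abstract (n-grams, LDA type inference, similarity from type distributions) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the inference method, the similarity measure, or any hyperparameters, so the collapsed Gibbs sampler, the Hellinger-based similarity, the values of `alpha`, `beta`, and `iters`, the function names, and the toy messages below are all assumptions chosen for illustration.

```python
import math
import random


def ngrams(message: bytes, n: int = 2) -> list:
    """Slide a window of n bytes over the message."""
    return [message[i:i + n] for i in range(len(message) - n + 1)]


def infer_type_distributions(docs, num_types, iters=200,
                             alpha=0.1, beta=0.01, seed=0):
    """LDA via collapsed Gibbs sampling (an assumed inference method).

    docs is a list of n-gram lists; returns one type distribution per message.
    """
    rng = random.Random(seed)
    vocab = {g: i for i, g in enumerate(sorted({g for d in docs for g in d}))}
    vocab_size = len(vocab)
    words = [[vocab[g] for g in d] for d in docs]

    # Count tables: n-grams per (message, type), per (type, n-gram), per type.
    doc_type = [[0] * num_types for _ in docs]
    type_word = [[0] * vocab_size for _ in range(num_types)]
    type_total = [0] * num_types

    def update(d, w, k, delta):
        doc_type[d][k] += delta
        type_word[k][w] += delta
        type_total[k] += delta

    # Random initial type assignment for every n-gram occurrence.
    assign = []
    for d, ws in enumerate(words):
        zs = [rng.randrange(num_types) for _ in ws]
        for w, k in zip(ws, zs):
            update(d, w, k, +1)
        assign.append(zs)

    for _ in range(iters):
        for d, ws in enumerate(words):
            for i, w in enumerate(ws):
                update(d, w, assign[d][i], -1)
                # Conditional probability of each type for this n-gram.
                weights = [
                    (doc_type[d][t] + alpha)
                    * (type_word[t][w] + beta)
                    / (type_total[t] + vocab_size * beta)
                    for t in range(num_types)
                ]
                k = rng.choices(range(num_types), weights=weights)[0]
                assign[d][i] = k
                update(d, w, k, +1)

    # Smoothed per-message type distribution.
    return [
        [(doc_type[d][t] + alpha) / (len(ws) + num_types * alpha)
         for t in range(num_types)]
        for d, ws in enumerate(words)
    ]


def similarity(p, q):
    """1 minus the Hellinger distance (an assumed measure); 1.0 means identical."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return 1.0 - math.sqrt(max(0.0, 1.0 - bc))


# Toy messages (hypothetical, not real protocol traffic).
messages = [
    b"GET /temp", b"GET /light", b"GET /hum",
    b"ACK 0001", b"ACK 0002", b"ACK 0003",
]
docs = [ngrams(m, 2) for m in messages]
theta = infer_type_distributions(docs, num_types=2)
for m, t in zip(messages, theta):
    print(m, [round(x, 2) for x in t])
print("same kind:", round(similarity(theta[0], theta[1]), 2))
print("different kind:", round(similarity(theta[0], theta[3]), 2))
```

A pairwise similarity matrix built this way could then be passed to any standard clustering algorithm. Which algorithm the paper uses, and how message formats are extracted from the resulting clusters, is not stated in the abstract, so those steps are left out of the sketch.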