Two Class Pruned Log Message Anomaly Detection

Authors

Farzad Amir, Gulliver T Aaron

Affiliation

Department of Electrical and Computer Engineering, University of Victoria, PO Box 1700, STN CSC, Victoria, BC V8W 2Y2 Canada.

Publication

SN Comput Sci. 2021;2(5):391. doi: 10.1007/s42979-021-00772-9. Epub 2021 Jul 24.

DOI:10.1007/s42979-021-00772-9

PMID:34337434

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8310418/

Abstract

Log messages are widely used in cloud servers and other systems. Millions of logs are generated each day, which makes them an important data source for anomaly detection. However, they are complex, unstructured text messages, which makes this task difficult. In this paper, a hybrid log message anomaly detection technique is proposed that prunes both positive and negative logs. Reliable positive log messages are first selected using a Gaussian mixture model algorithm. Reliable negative logs are then selected iteratively using the K-means, Gaussian mixture model, and Dirichlet process Gaussian mixture model methods. It is shown that the precision for positive and negative logs with pruning is high. Anomaly detection is then performed using a long short-term memory (LSTM) deep learning network. The proposed model is evaluated on the well-known BGL, Openstack, and Thunderbird data sets. The results indicate that the proposed model performs better than several well-known algorithms.
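The idea of selecting "reliable" samples with a Gaussian mixture model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the features, the number of components, or the selection threshold, so everything here is an assumption. The sketch fits a two-component one-dimensional mixture with plain EM to synthetic scores and keeps only the samples whose posterior probability for one component is at least 0.95. The names `fit_gmm_1d`, `select_reliable`, `scores`, and the threshold value are hypothetical, and the iterative K-means and Dirichlet process steps and the LSTM classifier are not covered.

```python
import math
import random
import statistics


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given x."""
    joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    if total == 0.0:  # x is far from every component; fall back to the priors
        return list(weights)
    return [j / total for j in joint]


def fit_gmm_1d(xs, iters=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM (toy version)."""
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]  # start the two components at the extremes
    spread = statistics.pvariance(xs) or 1.0
    variances = [spread, spread]
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        resp = [posteriors(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weight, mean, and variance of each component
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component received no mass; leave it unchanged
            weights[k] = nk / len(xs)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)  # guard against a collapsed component
    return weights, means, variances


def select_reliable(xs, params, component, threshold=0.95):
    """Indices of samples assigned to `component` with high posterior."""
    return [i for i, x in enumerate(xs)
            if posteriors(x, *params)[component] >= threshold]


# Synthetic 1-D scores (an assumption; real log features are not 1-D scores):
# 200 samples around 0 and 40 samples around 6.
random.seed(0)
scores = ([random.gauss(0.0, 1.0) for _ in range(200)]
          + [random.gauss(6.0, 1.0) for _ in range(40)])

params = fit_gmm_1d(scores)
weights, means, variances = params
reliable_low = select_reliable(scores, params, component=0)
reliable_high = select_reliable(scores, params, component=1)
print(len(reliable_low), len(reliable_high), len(scores))
```

Samples that fall between the two components meet neither threshold and are left out of both sets, which is the pruning effect: a smaller but cleaner set of labels for a downstream classifier. Which component corresponds to positive or negative logs is not specified in the abstract and is left open here.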
