Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100080, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing, 100080, China.
School of Cyber Security, University of Chinese Academy of Sciences, Beijing, 100080, China.
Neural Netw. 2024 Dec;180:106680. doi: 10.1016/j.neunet.2024.106680. Epub 2024 Aug 31.
Most existing log-driven anomaly detection methods assume that logs are static and unchanging, an assumption that rarely holds in practice. To address this, we propose a log anomaly detection model called DualAttlog. The model comprises a word-level semantic encoding module, a sequence-level semantic encoding module, and a context-aware dual attention module. Specifically, the word-level semantic encoding module uses a self-matching attention mechanism to explore the interactions between words in log sequences. Through word embedding and semantic encoding, it captures the associations between words and how they evolve, extracting local semantic information. The sequence-level semantic encoding module, in turn, encodes the entire log sequence using a pre-trained model. This extracts global semantic information, capturing overall patterns and trends in the logs. The context-aware dual attention module integrates these two levels of encoding, using contextual information to reduce redundancy and improve detection accuracy. Experimental results show that the DualAttlog model achieves an F1-Score of over 95% on 7 public datasets. It also achieves an F1-Score of 82.35% on the Real-Industrial W dataset and 83.54% on the Real-Industrial Q dataset. Across all 9 datasets, it outperforms existing baseline techniques.
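The data flow described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the self-matching attention or of the fusion step, so the sketch assumes plain scaled dot-product attention for both, uses random vectors in place of real word embeddings and of the pre-trained sequence encoder's output, and fuses the two levels by concatenation. All function names and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_matching_attention(words):
    # Assumption: scaled dot-product self-attention stands in for the
    # paper's self-matching attention. words has shape (n_words, d).
    d = words.shape[-1]
    scores = words @ words.T / np.sqrt(d)      # (n_words, n_words)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ words                     # (n_words, d) local context

def dual_attention_fuse(word_ctx, seq_vec):
    # Assumption: the sequence-level vector acts as a query over the
    # word-level vectors, and the two levels are fused by concatenation.
    d = seq_vec.shape[-1]
    scores = word_ctx @ seq_vec / np.sqrt(d)   # (n_words,)
    weights = softmax(scores)                  # attention over words
    local_summary = weights @ word_ctx         # (d,)
    return np.concatenate([local_summary, seq_vec]), weights

rng = np.random.default_rng(0)
word_embeddings = rng.normal(size=(6, 8))      # placeholder: 6 words, dim 8
sequence_embedding = rng.normal(size=8)        # placeholder for a pre-trained encoder output

word_ctx = self_matching_attention(word_embeddings)
fused, attn = dual_attention_fuse(word_ctx, sequence_embedding)
```

In a full model, the fused vector would feed a classifier that labels the log sequence as normal or anomalous; that stage is omitted here because the abstract does not describe it.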