CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks.

Affiliations

School of Software, Xinjiang University, Urumqi 830046, China.

College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.

Publication information

Sensors (Basel). 2023 May 24;23(11):5042. doi: 10.3390/s23115042.

DOI: 10.3390/s23115042
PMID: 37299767
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10255444/
Abstract

System logs are a crucial component of system maintainability: they record the state of the system and the essential events needed for troubleshooting and maintenance. Anomaly detection on system logs is therefore important. Recent research has focused on extracting semantic information from unstructured log messages for log anomaly detection. Because BERT models perform well in natural language processing, this paper proposes CLDTLog, an approach that introduces contrastive learning and dual-objective tasks into a pre-trained BERT model and performs anomaly detection on system logs through a fully connected layer. The approach does not require log parsing and thus avoids the uncertainty that log parsing introduces. We trained the CLDTLog model on two log datasets, HDFS and BGL, and achieved F1 scores of 0.9971 and 0.9999, respectively, outperforming all known methods. In addition, when trained on only 1% of the BGL dataset, CLDTLog still achieves an F1 score of 0.9993, showing excellent generalization performance while significantly reducing the training cost.
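
The abstract names two training objectives but does not give their formulas. As a rough illustration only, the sketch below combines a supervised contrastive loss with a cross-entropy classification loss. The choice of these two losses, the mixing weight `lam`, the `temperature` value, and all function names are assumptions, not the authors' implementation. The `embeddings` argument stands in for BERT sentence vectors of log messages, and `class_logits` stands in for the output of the fully connected layer.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors (zero vectors are not handled)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Assumed contrastive objective: same-label samples are pulled together."""
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive sample contributes nothing
        sims = {j: cosine(embeddings[i], embeddings[j]) / temperature
                for j in range(n) if j != i}
        # Numerically stable log of the softmax denominator over all other samples.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy_loss(class_logits, labels):
    """Mean cross-entropy of per-sample class logits (normal = 0, anomalous = 1)."""
    total = 0.0
    for logits, y in zip(class_logits, labels):
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[y]
    return total / len(labels)


def dual_objective_loss(embeddings, class_logits, labels, lam=0.5, temperature=0.1):
    """Weighted sum of the two objectives; `lam` is a hypothetical mixing weight."""
    con = supervised_contrastive_loss(embeddings, labels, temperature)
    ce = cross_entropy_loss(class_logits, labels)
    return lam * con + (1.0 - lam) * ce


# Toy data: four log messages, two normal (0) and two anomalous (1).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_logits = [[2.0, -1.0], [1.5, 0.0], [-1.0, 2.0], [0.0, 1.0]]
toy_labels = [0, 0, 1, 1]
print(dual_objective_loss(toy_embeddings, toy_logits, toy_labels))
```

In this sketch the contrastive term is small when embeddings of same-label messages already cluster together, so minimizing the combined loss shapes the representation and trains the classifier at the same time. A real setup would compute both terms on tensors from a BERT encoder inside a deep learning framework.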


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/3e809a256f5d/sensors-23-05042-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/c483fd0cb3ce/sensors-23-05042-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/266abc614ab8/sensors-23-05042-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/e632083a3aa7/sensors-23-05042-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/5b1257ed3b6d/sensors-23-05042-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/ee77f832912a/sensors-23-05042-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/6a5c0d381dc3/sensors-23-05042-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/deffba87c329/sensors-23-05042-g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a67/10255444/908716f312a9/sensors-23-05042-g009.jpg

Similar articles

1
CLDTLog: System Log Anomaly Detection Method Based on Contrastive Learning and Dual Objective Tasks.
Sensors (Basel). 2023 May 24;23(11):5042. doi: 10.3390/s23115042.
2
Log Sequence Anomaly Detection Method Based on Contrastive Adversarial Training and Dual Feature Extraction.
Entropy (Basel). 2021 Dec 30;24(1):69. doi: 10.3390/e24010069.
3
Impact of log parsing on deep learning-based anomaly detection.
Empir Softw Eng. 2024;29(6):139. doi: 10.1007/s10664-024-10533-w. Epub 2024 Aug 17.
4
ConAnomaly: Content-Based Anomaly Detection for System Logs.
Sensors (Basel). 2021 Sep 13;21(18):6125. doi: 10.3390/s21186125.
5
Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment.
J Biomed Inform. 2023 Feb;138:104285. doi: 10.1016/j.jbi.2023.104285. Epub 2023 Jan 9.
6
Extracting comprehensive clinical information for breast cancer using deep learning methods.
Int J Med Inform. 2019 Dec;132:103985. doi: 10.1016/j.ijmedinf.2019.103985. Epub 2019 Oct 2.
7
DualAttlog: Context aware dual attention networks for log-based anomaly detection.
Neural Netw. 2024 Dec;180:106680. doi: 10.1016/j.neunet.2024.106680. Epub 2024 Aug 31.
8
Two Class Pruned Log Message Anomaly Detection.
SN Comput Sci. 2021;2(5):391. doi: 10.1007/s42979-021-00772-9. Epub 2021 Jul 24.
9
Multi-Label Classification in Patient-Doctor Dialogues With the RoBERTa-WWM-ext + CNN (Robustly Optimized Bidirectional Encoder Representations From Transformers Pretraining Approach With Whole Word Masking Extended Combining a Convolutional Neural Network) Model: Named Entity Study.
JMIR Med Inform. 2022 Apr 21;10(4):e35606. doi: 10.2196/35606.
10
Classifying social determinants of health from unstructured electronic health records using deep learning-based natural language processing.
J Biomed Inform. 2022 Mar;127:103984. doi: 10.1016/j.jbi.2021.103984. Epub 2022 Jan 7.
