Conference: IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems
Improving Performances of Log Mining for Anomaly Prediction through NLP-based Log Parsing

Abstract

Failure prediction of industrial systems is a promising application domain for data mining approaches and should naturally rely on log messages, which are a prime source of data because many systems generate them. However, before relevant information can be extracted from such log messages, another critical step is to parse the logs, that is, to transform the raw unstructured text of the log messages into a suitable input for data mining. These two problems (log parsing, then log mining) are often studied separately although they are directly related in the context of failure prediction; moreover, few performance benchmarks are publicly available. In this paper, we focus on the impact of log parsing techniques based on natural language processing on the performance of log mining on two datasets. The first is a log of an industrial aeronautical system comprising over 4,500,000 messages collected over one year of operation; the second is a public benchmark set from an HDFS cluster. On the latter, we show that it is possible to raise the F-score from 96% to 99.2% while using simpler and more robust log parsing techniques that require less parameter tuning, provided that they are correctly combined with log mining techniques.
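
The abstract does not detail the parsing pipeline, so the following is only a minimal sketch of what "transforming raw unstructured text into a suitable input for data mining" can look like, not the authors' method. It assumes a simple bag-of-words representation; the function names `tokenize` and `bag_of_words`, the `<num>` placeholder, and the sample log lines are all hypothetical and chosen for illustration.

```python
from collections import Counter


def tokenize(message):
    """Split a raw log line into lowercase word tokens.

    Assumption for this sketch: any token containing a digit (block IDs,
    IP addresses, sizes) is a variable parameter and is replaced by the
    placeholder "<num>", so lines that differ only in parameters produce
    the same token sequence.
    """
    tokens = []
    for raw in message.lower().split():
        if any(ch.isdigit() for ch in raw):
            tokens.append("<num>")
        else:
            word = raw.strip(".,:;()[]\"'")
            if word:
                tokens.append(word)
    return tokens


def bag_of_words(messages):
    """Turn log lines into count vectors over a shared, sorted vocabulary."""
    tokenized = [tokenize(m) for m in messages]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


# Hypothetical log lines written in an HDFS-like style (not taken from the paper).
logs = [
    "Receiving block blk_123 src: /10.0.0.1:50010",
    "Receiving block blk_456 src: /10.0.0.2:50010",
    "Exception while serving blk_123",
]
vocab, vectors = bag_of_words(logs)
```

In this sketch the first two lines map to identical vectors because they differ only in their parameters, while the third line yields a different vector. A log mining step (for example a classifier predicting failures) would then consume such vectors; which features and models work best is exactly the kind of question the paper evaluates.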