Tools and Benchmarks for Automated Log Parsing

Abstract

Logs are imperative in the development and maintenance of many software systems. They record detailed runtime information that allows developers and support engineers to monitor their systems and dissect anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, cause the volume of logs to explode. In many cases, the traditional practice of manual log inspection becomes impractical. Many recent studies, as well as industrial tools, resort to powerful text search and machine learning-based analytics solutions. Because logs are unstructured, a crucial first step is to parse log messages into structured data for subsequent analysis. In recent years, automated log parsing has been widely studied in both academia and industry, producing a series of log parsers based on different techniques. To better understand the characteristics of these log parsers, in this paper we present a comprehensive evaluation study of automated log parsing and release the tools and benchmarks for easy reuse. More specifically, we evaluate 13 log parsers on a total of 16 log datasets spanning distributed systems, supercomputers, operating systems, mobile systems, server applications, and standalone software. We report the benchmarking results in terms of accuracy, robustness, and efficiency, which are of practical importance when deploying automated log parsing in production. We also share the success stories and lessons learned from an industrial application at Huawei. We believe that our work can serve as a basis for, and provide valuable guidance to, future research on and deployment of automated log parsing.
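To make "parsing log messages into structured data" concrete, the following minimal Python sketch separates the constant part of a message (the event template) from its variable parameters. It is not one of the 13 parsers evaluated in the paper. The regular expressions, the function name `parse_log_message`, and the HDFS-style sample line are all assumptions made up for this illustration.

```python
import re

# Assumed patterns for variable fields. These are illustrative only;
# the parsers evaluated in the paper use more elaborate techniques.
VARIABLE_PATTERN = re.compile(
    r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"  # IPv4 address with optional port
    r"|\bblk_-?\d+\b"                         # HDFS-style block identifier
    r"|\b\d+\b"                               # plain integer
)


def parse_log_message(message):
    """Split a raw log message into an event template and its parameters."""
    # Collect the variable fields in the order they appear.
    parameters = VARIABLE_PATTERN.findall(message)
    # Replace each variable field with the wildcard token "<*>".
    template = VARIABLE_PATTERN.sub("<*>", message)
    return {"template": template, "parameters": parameters}


# Hypothetical sample line, written for this sketch.
example = "Received block blk_3587508140051953248 of size 67108864 from 10.251.42.84"
parsed = parse_log_message(example)
print(parsed["template"])    # Received block <*> of size <*> from <*>
print(parsed["parameters"])  # ['blk_3587508140051953248', '67108864', '10.251.42.84']
```

Messages that share a template can then be grouped and counted as one event type, which is the structured form that downstream analysis typically consumes.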