Mining Unstructured Log Messages for Security Threat Detection

Abstract

As computers become larger, more powerful, and more connected, many challenges arise in implementing and maintaining a secure computing environment. Some of these challenges come from the exponential increase in unstructured messages generated by computer systems and applications. Although these data contain a wealth of information that is useful for advanced threat detection and for predicting future anomalies, the sheer volume, variety, and complexity of the data make it difficult for even well-trained analysts to extract the right information. While conventional SIEM (Security Information and Event Management) tools provide some capability to collect, correlate, and detect certain events from structured messages, their rule-based correlation and detection algorithms fall short in utilizing the information in unstructured messages. This study explores the possibility of using text mining, natural language processing, and machine learning techniques to detect security threats by extracting relevant information from various unstructured log messages collected from distributed, non-homogeneous systems. The extracted features are used to run a number of experiments on the Packet Clearing House SKAION 2006 IARPA Dataset, and the prediction performance is evaluated. In comparison to the base case without feature extraction, an average accumulated performance gain of 16.73% and a time reduction of 84% were achieved using extracted features only, while a 23.48% performance gain with a time increase of 82.39% was attained using both unstructured free-text messages and extracted features. The results show strong potential for further performance gains from using larger training sets and extracting more features from the unstructured log messages.
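
As a rough illustration of what "extracting features from unstructured log messages" can look like, the following minimal sketch pulls a few simple features out of a free-text log line using regular expressions. It is not the method used in the study, whose actual feature set and pipeline are not described in the abstract. The patterns, the keyword list, the `extract_features` function, and the sample log line are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical patterns and keywords (assumptions for illustration only;
# the abstract does not say which features the study extracts).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
USER = re.compile(r"\buser[ =](\w+)", re.IGNORECASE)
KEYWORDS = ("failed", "denied", "invalid", "error")


def extract_features(message: str) -> dict:
    """Turn one free-text log message into a small dictionary of features."""
    lower = message.lower()
    user = USER.search(message)
    return {
        # All dotted-quad strings that look like IPv4 addresses.
        "ip_addresses": IPV4.findall(message),
        # First token following "user " or "user=", if any.
        "user": user.group(1) if user else None,
        # Which of the assumed alert keywords appear in the message.
        "keywords": [k for k in KEYWORDS if k in lower],
        # Crude size feature: number of whitespace-separated tokens.
        "token_count": len(message.split()),
    }


# Hypothetical sample message in the style of an SSH daemon log line.
sample = "Failed password for invalid user admin from 192.0.2.10 port 22 ssh2"
features = extract_features(sample)
print(features)
```

Features of this kind could then be fed to a classifier either on their own or alongside the raw message text, which corresponds loosely to the two configurations the abstract compares.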

Bibliographic record

  • Author

    Suh-Lee Candace

  • Affiliation
  • Year: 2016
  • Total pages
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
