Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Real-Time Change-Point Detection Using Sequentially Discounting Normalized Maximum Likelihood Coding

Abstract

We address real-time change-point detection in time series. This problem has recently received considerable attention in data mining because it applies to a wide variety of important risk management issues, such as detecting failures of computer devices from computer performance data and detecting masqueraders or malicious executables from computer access logs. In this paper we propose a new method for real-time change-point detection that employs sequentially discounting normalized maximum likelihood (SDNML) coding. SDNML is a method for sequential data compression, newly developed in this paper. It attains the least code length for the sequence while gradually discounting the effect of past data as time goes on, so that compression adapts to non-stationary data sources. In our method, SDNML is used to learn the mechanism of a time series, and a change-point score at each time point is then measured in terms of the SDNML code length. We empirically demonstrate the significant superiority of our method over existing methods, such as the predictive-coding method and the hypothesis-testing method, in terms of detection accuracy and computational efficiency on artificial data sets. We further apply our method to a real security problem, malware detection, and empirically demonstrate that it is able to detect unseen security incidents at significantly early stages.
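The general idea of scoring each point by a code length under a model that gradually forgets old data can be illustrated with a much simpler stand-in. The sketch below is not the SDNML code described in the abstract, whose formulas are not given here. It assumes a univariate Gaussian model whose mean and variance are updated with exponential discounting, and it scores each observation by its negative log predictive density (a plug-in code length in nats). The function name `discounted_code_lengths`, the discounting rate `r`, the initial variance of 1.0, and `var_floor` are all hypothetical choices made for illustration.

```python
import math


def discounted_code_lengths(xs, r=0.05, var_floor=1e-6):
    """Return a plug-in Gaussian code length (in nats) for each x in xs[1:].

    Assumption: this is a simplified stand-in for illustration, not SDNML.
    The mean and variance are exponentially discounted with rate r, so
    older observations lose weight geometrically.
    """
    mean, var = xs[0], 1.0  # initial guesses (assumed, not from the paper)
    scores = []
    for x in xs[1:]:
        # Code length of x under the model learned from the past only.
        score = 0.5 * math.log(2 * math.pi * var) + (x - mean) ** 2 / (2 * var)
        scores.append(score)
        # Discounted updates: blend the old estimate with the new sample.
        mean = (1 - r) * mean + r * x
        var = max((1 - r) * var + r * (x - mean) ** 2, var_floor)
    return scores


# Toy series: small oscillation around 0, then a jump to around 5.
series = [0.1 * (-1) ** i for i in range(50)] + [5 + 0.1 * (-1) ** i for i in range(50)]
scores = discounted_code_lengths(series)
change_index = scores.index(max(scores)) + 1  # +1 because scores start at xs[1]
```

On this toy series the largest score lands on the first sample after the jump, since that sample is poorly predicted by the model fitted to the earlier data. A larger `r` forgets the past faster, which adapts more quickly after a change but makes the score noisier.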
