Learning from sequential data for anomaly detection.

Abstract

Anomaly detection has been applied to a wide range of real-world problems and has received significant attention in a number of research fields over the last decades. It attempts to identify events, activities, or observations that are measurably different from the expected behavior or pattern present in a dataset. This thesis focuses on a specific set of techniques for detecting anomalous behavior in discrete, symbolic, sequential datasets. Profiling complex sequential data is still an open problem in anomaly detection, and the rate at which sequential data is produced in fields ranging from finance to homeland security is exploding. There is therefore a pressing need for effective detection algorithms that can handle patterns in sequential information flows.

In this thesis, we address context-aware multi-class anomaly detection as applied to discrete sequences and develop a context learning approach using an unsupervised learning paradigm. We begin the anomaly detection process by applying our approach to differentiate normal behavior classes (contexts) before attempting to model normal behavior. This leads to stronger learning on each class, because an advanced model can be dedicated to identifying the normal behavior of each sequence class. We evaluate our discrete sequence-based anomaly detection framework using two illustrative applications: 1) system call intrusion detection and 2) crowd anomaly detection. We also evaluate how clustering can guide our context-aware methodology to positively impact the anomaly detection rate.

In this thesis, we utilize a Hidden Markov Model (HMM) to perform anomaly detection. An HMM is the simplest dynamic Bayesian network. It is a Markov model that can be used when the states are not observable, but the observed data depends on these hidden states. While there has been a large amount of prior work utilizing HMMs for anomaly detection, the proposed models became overly complex when attempting to improve the detection rate while reducing the false detection rate.

We apply HMMs to perform anomaly detection on discrete sequential data. We utilize multiple HMMs, one for each context class. We demonstrate our multi-HMM approach on system call anomalies in cyber security and provide results in the presence of anomalies. By applying process trace analysis with multiple HMMs, we achieve better system call anomaly detection results using better-tuned model settings and a less complex structure.

To evaluate the extensibility of our approach, we consider a second application, crowd behavior analytics. We attempt to classify crowd behavior and treat this as an anomaly detection problem on sequential data. We convert crowd video data into a discrete, symbolic sequence. We apply computer vision techniques to generate features from objects, and use these features in frame-based representations to model the behavior of the crowd in a video stream. We attempt to identify anomalous crowd behavior in a scene by applying machine learning techniques to learn what it means for a video stream to be "normal". The results of applying our context-aware multi-HMM approach to crowd analytics show the generality of our anomaly detection approach and the power of our context-learning approach.
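
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of scoring a discrete sequence against several HMMs, one per context. It is not the thesis's code. It assumes the context models already exist (the context-learning and training steps are skipped), and the two toy models in `MODELS`, the three-symbol alphabet, the per-symbol normalization, and the threshold of 1.0 are all invented for illustration. The function names `log_likelihood`, `anomaly_score`, and `is_anomalous` are hypothetical.

```python
import math


def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | model) for a discrete HMM.

    seq   : non-empty list of symbol indices
    start : start[i] = P(first hidden state is i)
    trans : trans[i][j] = P(next state is j | current state is i)
    emit  : emit[i][k] = P(symbol k | state i)
    """
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    total = 0.0
    for t in range(len(seq)):
        if t > 0:
            # Propagate one step through the transitions, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][seq[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        total += math.log(scale)
        # Rescale so alpha stays a probability vector and never underflows.
        alpha = [a / scale for a in alpha]
    return total


def anomaly_score(seq, models):
    """Negative per-symbol log-likelihood under the best-fitting context model."""
    best = max(log_likelihood(seq, *model) for model in models)
    return -best / len(seq)


def is_anomalous(seq, models, threshold):
    """Flag a sequence that no context model explains well (assumed rule)."""
    return anomaly_score(seq, models) > threshold


# Toy context models over the alphabet {0, 1, 2}; all numbers are made up.
# Context A: two states that mostly alternate, emitting 0 and 1 respectively.
CONTEXT_A = (
    [0.5, 0.5],
    [[0.1, 0.9], [0.9, 0.1]],
    [[0.9, 0.05, 0.05], [0.05, 0.9, 0.05]],
)
# Context B: a single state that mostly emits symbol 2.
CONTEXT_B = ([1.0], [[1.0]], [[0.05, 0.05, 0.9]])
MODELS = [CONTEXT_A, CONTEXT_B]

if __name__ == "__main__":
    for seq in ([0, 1, 0, 1, 0, 1], [2, 2, 2, 2], [0, 2, 1, 2, 0, 2]):
        print(seq, round(anomaly_score(seq, MODELS), 3), is_anomalous(seq, MODELS, 1.0))
```

Taking the maximum over the context models means a sequence is judged against the context it fits best, so behavior that is normal in any one context is not flagged merely because it is unusual in the others. In practice the models would be trained, for example with Baum-Welch, and the threshold would be chosen on held-out data.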

Bibliographic record

  • Author: Yolacan, Esra Nergis
  • Author affiliation: Northeastern University
  • Degree-granting institution: Northeastern University
  • Subject: Computer engineering; Computer science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 154
  • Original format: PDF
  • Language of text: English
