Journal: Advanced Science Letters

ADSTREAM: Anomaly Detection in Large-Scale Data Streams Using Local Outlier Factor Based on Micro-Cluster



Abstract

Micro-cluster based clustering methods cluster large-scale data streams efficiently by using two components, an online phase and an offline phase. The online component creates micro-clusters from the input data stream, and the offline component performs the final clustering on the micro-clusters formed by the online component. However, these methods treat anomaly detection passively and so do not identify outliers explicitly: most existing methodologies first cluster all the data and then label whatever was not clustered as outliers. Although typical micro-cluster based data stream clustering methods achieve excellent clustering quality, they are not suitable for anomaly detection, which must state clearly which data are outliers. In this paper, we propose ADSTREAM, which applies a Local Outlier Factor to the centers of micro-clusters in the offline component in order to detect and specify outliers. In the experiments, we visualize the anomaly detection results of ADSTREAM, perform micro-cluster based anomaly detection on the large-scale streams of the KDD-CUP1999 dataset, and show that ADSTREAM's anomaly detection performance improves dramatically over existing micro-cluster based clustering methods. As a result, ADSTREAM performs anomaly detection efficiently while preserving the advantages of existing data stream clustering algorithms for real-time large-scale streams.
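
The two-phase idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give ADSTREAM's online update rule or its parameters, so the radius-based absorption in `build_micro_clusters`, the neighbor count `k`, and the cut-off `threshold` are all assumptions chosen for illustration. The offline step uses the standard Local Outlier Factor definition, computed over micro-cluster centers rather than raw points.

```python
import math

def build_micro_clusters(stream, radius):
    """Online phase (simplified assumption): absorb each point into the
    nearest micro-cluster whose center lies within `radius`, otherwise
    open a new micro-cluster. Returns the list of centers."""
    clusters = []  # each entry keeps a count and a per-dimension linear sum
    for p in stream:
        best, best_d = None, radius
        for c in clusters:
            center = [s / c["n"] for s in c["ls"]]
            d = math.dist(p, center)
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append({"n": 1, "ls": list(p)})
        else:
            best["n"] += 1
            best["ls"] = [s + x for s, x in zip(best["ls"], p)]
    return [[s / c["n"] for s in c["ls"]] for c in clusters]

def lof_scores(points, k):
    """Offline phase: standard Local Outlier Factor for every point.
    Assumes the points are distinct and that len(points) > k."""
    n = len(points)
    neighbors, k_dist = [], []
    for i in range(n):
        ds = sorted((math.dist(points[i], points[j]), j)
                    for j in range(n) if j != i)
        kd = ds[k - 1][0]                      # distance to the k-th neighbor
        k_dist.append(kd)
        neighbors.append([(d, j) for d, j in ds if d <= kd])  # keeps ties
    lrd = []                                   # local reachability density
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))
    return [sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]

# Toy stream: four dense groups and one isolated point (hypothetical data).
stream = [(0, 0), (0.1, 0), (1, 0), (1.1, 0), (0, 1), (0, 1.1),
          (1, 1), (1.1, 1.1), (10, 10)]
centers = build_micro_clusters(stream, radius=0.3)
scores = lof_scores(centers, k=2)

threshold = 2.0  # hypothetical cut-off; scores near 1 indicate inliers
outliers = [c for c, s in zip(centers, scores) if s > threshold]
print(outliers)
```

Scoring the centers instead of every raw point keeps the offline step small, since the number of micro-clusters is typically far below the number of stream points. A production micro-cluster method would also track squared sums, timestamps, or decay weights, which this sketch omits.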
