Proceedings of the IASTED international conferences on informatics

INCREMENTAL METHODS FOR DETECTING OUTLIERS FROM MULTIVARIATE DATA STREAM

Abstract

Outlier detection is one of the most important data mining techniques. It has broad applications, such as fraud detection, credit approval, computer network intrusion detection, and anti-money laundering. The basis of outlier detection is to identify data points that are "different" or "far away" from the rest of the data points in a given dataset. The traditional outlier detection method is based on statistical analysis. However, this method has an inherent drawback: it requires the entire dataset to be available. In practice, especially in real-time data feed applications, it is unrealistic to wait for all the data, because fresh data stream in very quickly. Outlier detection is therefore done in batches. However, two drawbacks may arise: processing time is relatively long because of the massive data size, and the results may become outdated between successive updates. In this paper, we propose several novel incremental methods to process real-time data effectively for outlier detection. In the experiment, we test three types of mechanisms for analyzing the dataset, namely Global Analysis, Cumulative Analysis, and Lightweight Analysis with Sliding Window. The experimental dataset is "household power consumption", a popular benchmark dataset for Massive Online Analysis.
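The three analysis mechanisms named in the abstract can be contrasted with a minimal sketch. The abstract does not say how "far away" is measured, so everything here is an assumption made for illustration: the z-score rule, the threshold of 3.0, the warm-up size `min_points`, the window length, and all function names are hypothetical, and the example uses a single variable although the paper addresses multivariate streams.

```python
from collections import deque
from statistics import mean, pstdev


def is_outlier(value, reference, threshold=3.0, min_points=5):
    """Assumed rule: flag `value` if it lies more than `threshold`
    standard deviations from the mean of `reference`."""
    if len(reference) < min_points:
        return False  # too little history to judge (warm-up phase)
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


def global_analysis(stream, threshold=3.0):
    """Batch style: the entire dataset must be available first."""
    data = list(stream)
    return [i for i, x in enumerate(data) if is_outlier(x, data, threshold)]


def cumulative_analysis(stream, threshold=3.0):
    """Incremental: judge each new point against everything seen so far."""
    seen, flagged = [], []
    for i, x in enumerate(stream):
        if is_outlier(x, seen, threshold):
            flagged.append(i)
        seen.append(x)
    return flagged


def sliding_window_analysis(stream, window=5, threshold=3.0):
    """Incremental and lightweight: judge each new point against
    only the most recent `window` points."""
    recent, flagged = deque(maxlen=window), []
    for i, x in enumerate(stream):
        if is_outlier(x, recent, threshold):
            flagged.append(i)
        recent.append(x)  # the oldest point drops out automatically
    return flagged


# Made-up readings with one spike at index 10 (not the paper's dataset).
stream = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0,
          9.0,
          1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 1.05, 0.95]

print(global_analysis(stream))          # [10]
print(cumulative_analysis(stream))      # [10]
print(sliding_window_analysis(stream))  # [10]
```

On this toy input all three variants flag the same point. They differ in what they need in order to do so: the global variant cannot answer until the stream has ended, the cumulative variant keeps every past point, and the sliding-window variant keeps only a fixed number of recent points.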