Conference: European Conference on Principles and Practice of Knowledge Discovery in Databases

A Random Method for Quantifying Changing Distributions in Data Streams

Abstract

In applications such as fraud and intrusion detection, it is of great interest to measure evolving trends in the data. We consider the problem of quantifying changes between two datasets with class labels. Traditionally, changes are measured by first estimating the probability distributions of the given data and then computing a distance between the estimated distributions, for instance the Kullback-Leibler (K-L) divergence. However, this approach is computationally infeasible for large, high-dimensional datasets. The problem becomes more challenging in the streaming environment, where the high speed of the data makes it difficult for the learning process to keep up with concept drift. To tackle this problem, we propose a method to quantify concept drift using a universal model that incurs minimal learning cost. The model can also perform classification.
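The traditional approach mentioned in the abstract (estimate each dataset's distribution, then compute the K-L divergence between the estimates) can be sketched as follows. This is only an illustration of that baseline, not the method proposed in the paper. The toy data, the function names `estimate_distribution` and `kl_divergence`, and the use of additive smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def estimate_distribution(samples, support, smoothing=1.0):
    """Estimate a discrete distribution over `support` from `samples`.

    Additive smoothing (an assumption of this sketch) keeps every
    probability strictly positive so the divergence stays finite.
    """
    counts = Counter(samples)
    total = len(samples) + smoothing * len(support)
    return {x: (counts[x] + smoothing) / total for x in support}


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two distributions on the same support."""
    return sum(p[x] * math.log(p[x] / q[x]) for x in p if p[x] > 0)


# Hypothetical toy data: (feature value, class label) pairs from two windows.
old_window = [("a", 0)] * 40 + [("a", 1)] * 10 + [("b", 0)] * 10 + [("b", 1)] * 40
new_window = [("a", 0)] * 10 + [("a", 1)] * 40 + [("b", 0)] * 40 + [("b", 1)] * 10

# The joint support is the set of all (feature, label) pairs seen in either window.
support = sorted(set(old_window) | set(new_window))
p_old = estimate_distribution(old_window, support)
p_new = estimate_distribution(new_window, support)

drift = kl_divergence(p_old, p_new)
print(f"KL(old || new) = {drift:.4f}")
```

With one binary feature the joint support has only four cells. With d discrete features it grows exponentially in d, which is one way to see why explicit density estimation becomes impractical for large, high-dimensional data.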