
A Random Method for Quantifying Changing Distributions in Data Streams



Abstract

In applications such as fraud and intrusion detection, it is of great interest to measure evolving trends in the data. We consider the problem of quantifying changes between two datasets with class labels. Traditionally, changes are measured by first estimating the probability distributions of the given data and then computing a distance between the estimated distributions, for instance the K-L divergence. However, this approach is computationally infeasible for large, high-dimensional datasets. The problem becomes more challenging in the streaming data environment, where the high arrival rate makes it difficult for the learning process to keep up with concept drift in the data. To tackle this problem, we propose a method that quantifies concept drift using a universal model that incurs minimal learning cost. In addition, our model can also perform classification.
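The traditional approach mentioned in the abstract (estimate each distribution, then compute the K-L divergence) can be illustrated with a minimal sketch. This is not the paper's proposed method. The function names `binned_probs` and `kl_divergence`, the one-dimensional Gaussian data, the bin count, and the additive smoothing are all assumptions made for illustration.

```python
import math
import random


def binned_probs(values, lo, hi, bins=10, alpha=1.0):
    """Estimate a 1-D distribution as a smoothed histogram over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp out-of-range values into the first or last bin.
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1
    total = len(values) + alpha * bins
    # Additive (Laplace) smoothing keeps every bin probability positive.
    return [(c + alpha) / total for c in counts]


def kl_divergence(p, q):
    """K-L divergence D(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical data: two windows of a stream, the second with a shifted mean.
rng = random.Random(0)
old_window = [rng.gauss(0.0, 1.0) for _ in range(5000)]
new_window = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # simulated drift

p_old = binned_probs(old_window, -4.0, 5.0)
p_new = binned_probs(new_window, -4.0, 5.0)
drift = kl_divergence(p_new, p_old)
print(f"estimated drift: {drift:.3f} nats")
```

With d features and b bins per feature, a joint histogram of this kind needs b**d cells, which suggests why the abstract calls the estimate-then-compare approach infeasible for large, high-dimensional datasets.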
