Journal: Expert Systems with Applications

Detection of evolving concepts in non-stationary data streams: A multiple kernel learning approach



Abstract

Due to the unprecedented speed and volume of raw data generated in most applications, data stream mining has attracted a lot of attention recently. Methods for mining data streams should address the challenges in this area, such as infinite length, concept drift, recurring concepts, and concept evolution. Moreover, because data streams arrive at high speed, the time and space complexity of these methods is extremely important. This paper proposes a novel method based on multiple kernels for classifying non-stationary data streams, which addresses the mentioned challenges with special attention to space complexity. By learning multiple kernels and specifying the boundaries of classes in the feature (mapped) space of the combined kernels, the required amount of memory is decreased. These kernels are updated regularly throughout the stream when the true labels of instances are received. Newly arrived instances are classified with respect to their distance to the boundaries of the previously known classes in the feature spaces. Due to the efficient memory usage, the computation time does not increase significantly through the stream. We evaluate the performance of the proposed method using a set of experiments conducted on both real and synthetic benchmark data sets. The experimental results show the superiority of the proposed method over the state-of-the-art methods in this area. (C) 2017 Elsevier Ltd. All rights reserved.
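
The idea of classifying an instance by its distance to class boundaries in a combined-kernel feature space can be illustrated with a small sketch. The abstract does not specify the form of the boundary or how the kernels are learned, so everything below is an assumption made for illustration: each class is enclosed by a hypersphere around the mean of its mapped points, the combined kernel is a fixed weighted sum of two RBF kernels, and the names `combined_kernel`, `KernelSphere`, and `classify` are hypothetical. This is not the paper's algorithm.

```python
import math


def combined_kernel(x, y, gammas=(0.1, 1.0), weights=(0.5, 0.5)):
    """Weighted sum of RBF kernels. The gammas and weights are fixed,
    illustrative values (assumption); the paper learns its kernels."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return sum(w * math.exp(-g * sq) for w, g in zip(weights, gammas))


class KernelSphere:
    """Assumed boundary: a hypersphere in feature space centred on the
    mean of a class's mapped points, with radius reaching the farthest one."""

    def __init__(self, points):
        self.points = [tuple(p) for p in points]
        n = len(self.points)
        rows = [[combined_kernel(p, q) for q in self.points] for p in self.points]
        # Mean of all pairwise kernel values, i.e. the squared norm of the centre.
        self.mean_kk = sum(map(sum, rows)) / (n * n)
        # Squared distance of each training point to the centre (kernel trick).
        self.radius2 = max(
            rows[i][i] - 2 * sum(rows[i]) / n + self.mean_kk for i in range(n)
        )

    def dist2(self, x):
        """Squared feature-space distance from x to the class centre:
        k(x, x) - (2/n) * sum_i k(x, x_i) + (1/n^2) * sum_ij k(x_i, x_j)."""
        n = len(self.points)
        cross = sum(combined_kernel(x, p) for p in self.points) / n
        return combined_kernel(x, x) - 2 * cross + self.mean_kk


def classify(x, spheres):
    """Return the label whose boundary x falls inside (smallest margin),
    or "novel" when x lies outside every known class boundary."""
    label, sphere = min(
        spheres.items(), key=lambda kv: kv[1].dist2(x) - kv[1].radius2
    )
    margin = sphere.dist2(x) - sphere.radius2
    return label if margin <= 0 else "novel"


# Toy data: two known classes in the plane.
spheres = {
    "a": KernelSphere([(0, 0), (1, 0), (0, 1), (1, 1)]),
    "b": KernelSphere([(5, 5), (6, 5), (5, 6), (6, 6)]),
}
print(classify((0.5, 0.5), spheres))
print(classify((5.5, 5.5), spheres))
print(classify((20, -20), spheres))
```

In this sketch each class is summarised by its stored points and two scalars; a memory-conscious variant would keep only a compressed summary per class, which is the kind of saving the abstract emphasises. An instance that falls outside every boundary is flagged as a candidate for a new class, which loosely corresponds to concept evolution.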
