Published in: IEEE International Conference on Big Data

Clustering-Driven and Dynamically Diversified Ensemble for Drifting Data Streams

Abstract

Data stream mining is a rapidly developing branch of contemporary machine learning. Ensemble approaches have proven highly effective in this domain, owing to their predictive power and their ability to handle evolving data. One of the key aspects of ensemble learning is diversity among base classifiers: it improves accuracy and allows for anticipating and recovering from concept drifts. It has been shown that while diversity is desirable during changes, it may impede learning when data becomes stationary. In this paper, we present a novel ensemble technique that exploits the idea of dynamic diversification, which increases diversity during changes and reduces it when a stream becomes stable. The algorithm uses online clustering for this task, creating locally specialized base learners trained on spatially related instances. Three control strategies, based on a novel range heuristic, are used to manage the trade-off between error (a change indicator) and diversity. Additionally, two intensification strategies are proposed for exploiting newly arriving instances, allowing for faster adaptation. An experimental study evaluates the general performance and diversity of the proposed algorithm, showing that it can outperform state-of-the-art ensembles dedicated to drifting data stream mining.
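
The abstract describes the mechanism only in outline, so the following is a minimal illustrative sketch and not the authors' algorithm. It assumes sequential k-means as the online clustering step, a toy nearest-class-mean base learner per cluster, and a hypothetical reading of the range heuristic: the number of learners trained on each instance shrinks when the running error is high (learners stay locally specialized, hence more diverse) and grows when the error is low (learners see the same data, hence less diverse). All names (`ClassMeanLearner`, `ClusterDrivenEnsemble`, `update_range`, `decay`) are invented for illustration.

```python
import math
from collections import defaultdict


class ClassMeanLearner:
    """Toy online classifier: predicts the class whose running mean is nearest."""

    def __init__(self):
        self.sums = {}
        self.counts = defaultdict(int)

    def learn(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict(self, x):
        if not self.sums:
            return None  # nothing learned yet

        def dist(label):
            mean = [s / self.counts[label] for s in self.sums[label]]
            return math.dist(mean, x)

        return min(self.sums, key=dist)


class ClusterDrivenEnsemble:
    """One base learner per cluster; the update range follows the running error."""

    def __init__(self, initial_centroids, decay=0.9):
        self.centroids = [list(c) for c in initial_centroids]
        self.hits = [1] * len(self.centroids)
        self.learners = [ClassMeanLearner() for _ in self.centroids]
        self.error = 1.0  # decayed error estimate, used as the change indicator
        self.decay = decay

    def _ranked(self, x):
        # Cluster indices ordered from nearest to farthest centroid.
        return sorted(range(len(self.centroids)),
                      key=lambda i: math.dist(self.centroids[i], x))

    def update_range(self):
        # Assumption: high error -> train only the nearest learner (diversify);
        # low error -> train up to all learners (reduce diversity).
        return max(1, round((1.0 - self.error) * len(self.learners)))

    def predict(self, x):
        votes = defaultdict(float)
        for rank, i in enumerate(self._ranked(x)):
            label = self.learners[i].predict(x)
            if label is not None:
                votes[label] += 1.0 / (rank + 1)  # closer clusters weigh more
        return max(votes, key=votes.get) if votes else None

    def learn(self, x, y):
        # Test-then-train: update the error estimate before learning.
        wrong = 1.0 if self.predict(x) != y else 0.0
        self.error = self.decay * self.error + (1.0 - self.decay) * wrong
        ranked = self._ranked(x)
        nearest = ranked[0]
        # Sequential k-means step: move the nearest centroid toward x.
        self.hits[nearest] += 1
        rate = 1.0 / self.hits[nearest]
        self.centroids[nearest] = [c + rate * (v - c)
                                   for c, v in zip(self.centroids[nearest], x)]
        for i in ranked[: self.update_range()]:
            self.learners[i].learn(x, y)
```

In this sketch, calling `learn(x, y)` on each arriving instance performs a test-then-train step, and `update_range()` reports how many of the nearest learners the next instance would train. The paper's three control strategies and two intensification strategies are not specified in the abstract and are not modeled here.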
