Adaptive random forests for evolving data stream classification

Abstract

Random forests is currently one of the most used machine learning algorithms in the non-streaming (batch) setting. This preference is attributable to its high learning performance and low demands with respect to input preparation and hyper-parameter tuning. However, in the challenging context of evolving data streams, there is no random forests algorithm that can be considered state-of-the-art in comparison to bagging and boosting based algorithms. In this work, we present the adaptive random forest (ARF) algorithm for classification of evolving data streams. In contrast to previous attempts of replicating random forests for data stream learning, ARF includes an effective resampling method and adaptive operators that can cope with different types of concept drifts without complex optimizations for different data sets. We present experiments with a parallel implementation of ARF which has no degradation in terms of classification performance in comparison to a serial implementation, since trees and adaptive operators are independent from one another. Finally, we compare ARF with state-of-the-art algorithms in a traditional test-then-train evaluation and a novel delayed labelling evaluation, and show that ARF is accurate and uses a feasible amount of resources.

Bibliographic details

  • Source
    Machine Learning | 2017, Issue 10 | pp. 1469-1495 | 27 pages
  • Author affiliations

    Pontificia Univ Catolica Parana, PPGIa, Curitiba, Parana, Brazil;

    Univ Paris Saclay, Telecom ParisTech, LTCI, Paris, France;

    Univ Paris Saclay, Telecom ParisTech, LTCI, Paris, France|Ecole Polytech, LIX, Palaiseau, France;

    Pontificia Univ Catolica Parana, PPGIa, Curitiba, Parana, Brazil;

    Pontificia Univ Catolica Parana, PPGIa, Curitiba, Parana, Brazil;

    Univ Waikato, Dept Comp Sci, Hamilton, New Zealand;

    Univ Waikato, Dept Comp Sci, Hamilton, New Zealand;

    Univ Paris Saclay, Telecom ParisTech, LTCI, Paris, France|Natl Univ Singapore, IPAL, CNRS, UMI, Singapore, Singapore|Natl Univ Singapore, Sch Comp, Singapore, Singapore;

  • Original format: PDF
  • Language: English
  • Keywords

    Data stream mining; Random forests; Ensemble learning; Concept drift;
