Conference: International Joint Conference on Neural Networks

GPU-based State Adaptive Random Forest for Evolving Data Streams


Abstract

Random forest is an ensemble method that improves on the performance of single-tree classifiers. On evolving data streams, a classifier must adapt to change while working under space and time constraints. One benefit of random forest is that it can be executed in parallel. We introduce a random forest model that runs on a hybrid of GPU and CPU, called GPU-based State-Adaptive Random Forest (GSARF). We address existing challenges in adapting random forest to data streams, specifically in the area of continual learning. Our novel approach reuses previously seen trees when earlier concepts reappear. This lets us retain prior knowledge and provide more stable predictive accuracy when the data stream changes. Our random forest for data streams stores three types of trees: foreground trees, which are currently used in prediction; background trees, which are built when we become aware of possible changes in the data stream; and candidate trees, which were highly used in previous concepts but have been discarded because of changes in the data stream. We keep candidate trees in a repository, because they may be useful later, and access them when needed. We show empirically that our technique runs at up to 138 times the speed of current CPU-based random forest benchmarks. Our approach also outperforms a baseline GPU-based approach in terms of cumulative accuracy.
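The three tree roles described in the abstract can be illustrated with a small CPU-only Python sketch. It is not the GSARF implementation: the names `ForestState`, `start_background`, and `on_drift` are hypothetical, trees are stand-in callables, the repository size of 8 is arbitrary, and choosing a replacement by accuracy on a recent window is an assumption made for illustration. As a further simplification, every retired tree goes into the repository, whereas the abstract keeps only trees that were highly used.

```python
from collections import Counter, deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Optional, Sequence, Tuple

Example = Tuple[Sequence[float], int]    # (features, label)
Tree = Callable[[Sequence[float]], int]  # stand-in for a trained stream tree


def accuracy(tree: Tree, window: Sequence[Example]) -> float:
    """Fraction of examples in the window that the tree labels correctly."""
    if not window:
        return 0.0
    return sum(tree(x) == y for x, y in window) / len(window)


@dataclass
class ForestState:
    foreground: List[Tree]            # trees currently used for prediction
    background: List[Optional[Tree]]  # one optional replacement per slot
    # Bounded repository of retired trees (size 8 is an arbitrary assumption).
    candidates: Deque[Tree] = field(default_factory=lambda: deque(maxlen=8))

    def predict(self, x: Sequence[float]) -> int:
        """Majority vote over the foreground trees."""
        votes = [tree(x) for tree in self.foreground]
        return Counter(votes).most_common(1)[0][0]

    def start_background(self, slot: int, tree: Tree) -> None:
        """On a possible change, begin building a replacement for this slot."""
        self.background[slot] = tree

    def on_drift(self, slot: int, window: Sequence[Example]) -> None:
        """On a confirmed change, swap in the best available replacement."""
        options = list(self.candidates)
        if self.background[slot] is not None:
            options.append(self.background[slot])
        if not options:
            return  # nothing to swap in, so keep the current tree
        retired = self.foreground[slot]
        best = max(options, key=lambda tree: accuracy(tree, window))
        self.foreground[slot] = best
        if best in self.candidates:
            self.candidates.remove(best)  # a reused tree leaves the repository
        self.background[slot] = None
        self.candidates.append(retired)   # keep the old tree for later reuse
```

In this sketch a tree that is retired when one concept ends stays in `candidates`, so a later call to `on_drift` can bring it back to the foreground if it scores best on the recent window. That mirrors the reuse of previously seen trees when an earlier concept reappears.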