Accelerating GPU-based Evolutionary Induction of Decision Trees - Fitness Evaluation Reuse


Abstract

The rapid development of new technologies and parallel frameworks offers a chance to overcome the slowness of evolutionary induction of decision trees (DTs). This global approach, which searches for the tree structure and the tests simultaneously, is an emerging alternative to greedy top-down solutions. However, to be applied efficiently to big data mining, both technological and algorithmic possibilities need to be fully exploited. This paper shows how, by reusing information from previously evaluated individuals, we can further accelerate GPU-based evolutionary induction of DTs on large-scale datasets. Noting that some trees, or parts of them, may reappear during the evolutionary search, we have created a so-called repository of trees (split between the GPU and the CPU). Experimental evaluation is carried out on the existing Global Decision Tree system, where the fitness calculations are delegated to the GPU while the core evolution runs sequentially on the CPU. Results demonstrate that reusing information about trees from the repository (classification errors, objects' locations, etc.) can accelerate the original GPU-based solution. The speedup is especially visible on large-scale data, where the cost of evaluating a tree exceeds the cost of storing and searching the repository.
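
The reuse idea described in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: the names `Node`, `predict`, and `TreeRepository` are hypothetical, the GPU side is omitted entirely, and only whole trees are cached (the paper also mentions reusing parts of trees and objects' locations, which this sketch does not attempt). It assumes a tree's fitness depends only on its structure and tests, so that a structurally identical tree can reuse a stored error count.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """Immutable tree node (hypothetical). A leaf has `label` set;
    an internal node tests x[feature] <= threshold."""
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def predict(node: Node, x: Sequence[float]) -> int:
    """Route one object down the tree and return the leaf label."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


class TreeRepository:
    """Caches classification-error counts keyed by tree structure.

    Frozen dataclasses hash and compare by field values, so two trees
    built independently but with identical structure and tests map to
    the same cache entry.
    """

    def __init__(self, X: List[Sequence[float]], y: List[int]) -> None:
        self.X, self.y = X, y
        self.cache: Dict[Node, int] = {}
        self.hits = 0
        self.misses = 0

    def errors(self, tree: Node) -> int:
        """Return the number of misclassified objects, reusing a stored
        result when this exact tree has been evaluated before."""
        if tree in self.cache:
            self.hits += 1
            return self.cache[tree]
        self.misses += 1
        # Full evaluation: in the paper's setting this is the expensive
        # step delegated to the GPU; here it is a plain Python loop.
        err = sum(predict(tree, x) != t for x, t in zip(self.X, self.y))
        self.cache[tree] = err
        return err


# Usage: the second, structurally identical tree is served from the cache.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
repo = TreeRepository(X, y)
tree_a = Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))
tree_b = Node(feature=0, threshold=2.5, left=Node(label=0), right=Node(label=1))
repo.errors(tree_a)
repo.errors(tree_b)
print(repo.hits, repo.misses)
```

The trade-off named in the abstract shows up directly here: each lookup costs a hash of the whole tree plus the memory for the stored entry, so caching pays off only when a full evaluation over the dataset is more expensive than that lookup.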