International Conference on Business Information Systems

Genetic Programming over Spark for Higgs Boson Classification



Abstract

As more databases with very large numbers of records become available, existing knowledge discovery tools need to be adapted to this shift and new tools need to be created. Genetic Programming (GP) has proven to be an efficient algorithm, in particular for classification problems. However, GP is hampered by its computing cost, which becomes more acute with large datasets. This paper presents how an existing GP implementation (DEAP) can be adapted by distributing evaluations on a Spark cluster. An additional sampling step is then applied so that the approach fits tiny clusters. Experiments are carried out on Higgs Boson classification with different settings. They show the benefits of using Spark as a parallelization technology for GP.
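The two ideas named in the abstract, distributing fitness evaluations and sampling the data before evaluating, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not say whether individuals or records are distributed, so the sketch assumes the population is mapped over workers. The names `make_records`, `accuracy`, `evaluate_population` and `POPULATION` are hypothetical, the data is synthetic rather than the Higgs dataset, and a standard-library thread pool stands in for the Spark cluster.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_records(n, seed=0):
    """Build a synthetic (features, label) dataset; a stand-in for real data."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        label = 1 if x[0] + x[1] > 0 else 0
        records.append((x, label))
    return records


def accuracy(individual, records):
    """Fitness of one individual: fraction of records classified correctly."""
    hits = sum(1 for x, label in records if (individual(x) > 0) == (label == 1))
    return hits / len(records)


def evaluate_population(population, records, map_fn=map, sample_size=None, seed=0):
    """Evaluate every individual, optionally on a random sample of the records.

    map_fn is the pluggable part: the built-in map runs serially, while a
    pool's map (or a cluster-backed equivalent) spreads the evaluations out.
    """
    if sample_size is not None and sample_size < len(records):
        # Sampling step: shrink the data so that small clusters can cope.
        records = random.Random(seed).sample(records, sample_size)
    return list(map_fn(lambda ind: accuracy(ind, records), population))


# Hypothetical individuals: plain callables standing in for evolved GP trees.
POPULATION = [
    lambda x: x[0] + x[1],     # matches the labelling rule
    lambda x: x[0] - x[1],     # partially correct
    lambda x: -(x[0] + x[1]),  # inverted rule
]

if __name__ == "__main__":
    data = make_records(1000)
    with ThreadPoolExecutor(max_workers=4) as pool:
        scores = evaluate_population(POPULATION, data, map_fn=pool.map, sample_size=200)
    print(scores)
```

In DEAP, the evaluation loop calls whatever function is registered under the toolbox alias `map`, so a distributed map can be swapped in without touching the evolutionary algorithm itself. On a Spark cluster, one plausible analogue of `map_fn` would wrap `sc.parallelize(population).map(f).collect()`; whether the paper does exactly this is an assumption.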
