Venue: ACM Conference on Information and Knowledge Management

BagBoo: A Scalable Hybrid Bagging-the-Boosting Model

Abstract

In this paper, we introduce BagBoo, a novel machine learning approach for regression that combines bagging and boosting. BagBoo borrows its potential for high accuracy from Friedman's gradient boosting [2], and its efficiency and scalability through parallelism from Breiman's bagging [1]. We run empirical evaluations on large-scale Web ranking data and demonstrate that BagBoo not only achieves better relevance than standalone bagging or boosting, but also outperforms most previously published results on these data sets. We also emphasize that BagBoo is intrinsically scalable and parallelizable, which allowed us to train on the order of half a million trees on 200 nodes in 2 hours of CPU time, beat all competitors in the Internet Mathematics relevance competition sponsored by Yandex, and place among the top algorithms in both tracks of the Yahoo ICML-2010 challenge. We conclude that, although the experimental results presented here are in the context of regression trees, the hybrid BagBoo model is applicable to other domains, such as classification, and to other base training models.
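
The idea described in the abstract, bagging applied on top of boosted models, can be illustrated with a minimal sketch. This is not the authors' implementation: all function names (`fit_stump`, `boost`, `bagboo_fit`, `bagboo_predict`) and hyperparameter values are hypothetical, depth-1 regression stumps stand in for full regression trees, and the bags are trained in a plain sequential loop, whereas the abstract describes training them in parallel across nodes.

```python
import random
from statistics import mean

def fit_stump(X, r):
    """Fit a depth-1 regression tree to targets r by minimizing squared error."""
    best = None  # (sse, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            if not left or not right:
                continue
            lv, rv = mean(left), mean(right)
            sse = sum((v - lv) ** 2 for v in left) + sum((v - rv) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    if best is None:  # no valid split (all rows identical): predict a constant
        c = mean(r)
        return (0, float("inf"), c, c)
    return best[1:]

def stump_predict(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def boost(X, y, n_rounds=20, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        s = fit_stump(X, residuals)
        stumps.append(s)
        pred = [p + lr * stump_predict(s, row) for p, row in zip(pred, X)]
    return base, lr, stumps

def boost_predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)

def bagboo_fit(X, y, n_bags=5, n_rounds=20, lr=0.3, seed=0):
    """Train one boosted model per bootstrap sample (assumed hyperparameters)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_bags):  # bags are independent, so this loop could run in parallel
        idx = [rng.randrange(len(X)) for _ in X]  # sample with replacement
        models.append(boost([X[i] for i in idx], [y[i] for i in idx], n_rounds, lr))
    return models

def bagboo_predict(models, row):
    """Average the predictions of the boosted models."""
    return mean(boost_predict(m, row) for m in models)

# Usage on a toy one-feature regression problem.
X = [[float(x)] for x in range(20)]
y = [2.0 * x for x in range(20)]
models = bagboo_fit(X, y)
print(bagboo_predict(models, [10.0]))
```

Because each boosted model depends only on its own bootstrap sample, the outer loop needs no communication between bags until the final averaging step, which is the property the abstract credits for BagBoo's scalability.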
