IEEE International Parallel and Distributed Processing Symposium

Efficient Gradient Boosted Decision Tree Training on GPUs

Abstract

In this paper, we present a novel parallel implementation for training Gradient Boosting Decision Trees (GBDTs) on Graphics Processing Units (GPUs). Thanks to the wide use of the open-source XGBoost library, GBDTs have become very popular in recent years and have won many awards in machine learning and data mining competitions. Although GPUs have demonstrated their success in accelerating many machine learning applications, developing a GPU-based GBDT algorithm poses a series of key challenges, including irregular memory accesses, many small sorting operations, and varying degrees of data parallelism during tree construction. To tackle these challenges on GPUs, we propose several novel techniques, including Run-length Encoding compression, dynamic thread/block workload allocation, and reuse of intermediate training results for efficient gradient computation. Our experimental results show that our algorithm, named GPU-GBDT, is often 10 to 20 times faster than the sequential version of XGBoost, and achieves 1.5 to 2 times speedup over 40-threaded XGBoost running on a relatively high-end workstation with 20 CPU cores. Moreover, GPU-GBDT outperforms its CPU counterpart by 2 to 3 times in terms of performance-price ratio.
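As a concrete illustration of the Run-length Encoding idea, the following is a minimal sketch (ours, not the paper's implementation) of compressing a sorted, low-cardinality feature column on the GPU with Thrust's reduce_by_key; the sample data and variable names are chosen for illustration only.

    #include <thrust/device_vector.h>
    #include <thrust/reduce.h>
    #include <thrust/iterator/constant_iterator.h>
    #include <cstdio>

    int main() {
        // A sorted feature column with long runs of repeated values,
        // as commonly arises after discretizing feature values into bins.
        int h_vals[] = {1, 1, 1, 3, 3, 7, 7, 7, 7, 9};
        thrust::device_vector<int> vals(h_vals, h_vals + 10);

        thrust::device_vector<int> unique_vals(vals.size());
        thrust::device_vector<int> run_lengths(vals.size());

        // reduce_by_key collapses each run of equal keys into a single
        // (value, run_length) pair -- exactly a run-length encoding.
        auto ends = thrust::reduce_by_key(
            vals.begin(), vals.end(),
            thrust::constant_iterator<int>(1),  // each element counts as 1
            unique_vals.begin(),
            run_lengths.begin());

        int n_runs = ends.first - unique_vals.begin();
        for (int i = 0; i < n_runs; ++i)
            printf("(value=%d, count=%d)\n",
                   (int)unique_vals[i], (int)run_lengths[i]);
        return 0;
    }

Since split finding scans each feature column repeatedly, operating on (value, count) runs instead of raw elements can reduce memory traffic on the GPU.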
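The gradient-reuse idea can be sketched in a similar spirit. Assuming a binary logistic objective (an assumption on our part; the abstract does not fix the loss function), the hypothetical kernel below keeps the running prediction cached on the device across boosting iterations, so each iteration only folds in the newest tree's output before recomputing the first- and second-order gradients, instead of re-evaluating the whole ensemble.

    #include <cuda_runtime.h>

    // Hypothetical kernel: names and signature are ours, not the paper's.
    // y_hat is the intermediate training result being reused: it persists
    // on the device between boosting iterations.
    __global__ void update_gradients(const float* label,
                                     const float* new_tree_pred,
                                     float* y_hat,   // cached running score
                                     float* grad,    // first-order gradient
                                     float* hess,    // second-order gradient
                                     int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        // Fold only the latest tree into the cached prediction.
        y_hat[i] += new_tree_pred[i];

        // Logistic-loss gradients: g = sigmoid(y_hat) - y,
        // h = sigmoid(y_hat) * (1 - sigmoid(y_hat)).
        float p = 1.0f / (1.0f + expf(-y_hat[i]));
        grad[i] = p - label[i];
        hess[i] = p * (1.0f - p);
    }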
