Power Efficient MapReduce Workload Acceleration Using Integrated-GPU

Abstract

With the pervasiveness of MapReduce, one of the most prominent programming models for data parallelism in Apache Hadoop, many researchers and developers have spent tremendous effort attempting to boost the computational speed and energy efficiency of MapReduce-based big data processing. However, the scalable and fault-tolerant nature of MapReduce introduces additional costs in disk I/O and data transfer, caused by streaming intermediate outputs to disk. In light of these issues, many research projects have been initiated with the goal of improving the compute speed and power efficiency of compute-intensive cloud computing workloads, several of them by adding discrete GPUs. In this work, we present a modified MapReduce approach, focused on the iterative clustering algorithms in the Apache Mahout machine learning library, that leverages the acceleration potential of the Intel integrated GPU in a multi-node cluster environment. The accelerated framework shows varying levels of speed-up (~45x for Map tasks only, ~4.37x for the entire K-means clustering) as evaluated using the HiBench benchmark suite. Based on various experiments and in-depth analysis, we find that utilizing the integrated GPU via OpenCL offers significant performance and power efficiency gains over the original CPU-based approach. Further analysis is also done to understand the correlations between compute, I/O, and power efficiency. Our results show that embracing the integrated GPU in the Hadoop MapReduce framework represents a promising advance in adding cost- and energy-efficient compute parallelism to a data-parallel multi-node environment.
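To make the Map/Reduce split of K-means concrete, the following is a minimal Python sketch of one clustering iteration. It is not the authors' OpenCL or Mahout code: the function names (`kmeans_map`, `kmeans_reduce`, `kmeans_iteration`) are hypothetical, the grouping step stands in for Hadoop's shuffle, and Euclidean distance is an assumption. It only illustrates that the Map step handles each point independently, which is the kind of work a GPU can run in parallel.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def kmeans_map(point, centroids):
    """Map step (hypothetical name): emit (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def kmeans_reduce(points):
    """Reduce step (hypothetical name): average all points assigned to one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """Run one map -> group -> reduce pass and return the updated centroids."""
    groups = defaultdict(list)
    for p in points:  # each call is independent of the others
        key, value = kmeans_map(p, centroids)
        groups[key].append(value)
    # A cluster that received no points keeps its previous centroid.
    return [
        kmeans_reduce(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]
```

In a full run, `kmeans_iteration` would be repeated until the centroids stop moving; on Hadoop each repetition is a separate job whose intermediate output goes to disk, which is the I/O cost the abstract mentions.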

Bibliographic details

Source
IEEE International Conference on Big Data Computing Service and Applications | 2015 | 8 pages
Authors
Kim SungYe; Bottleson Jeremy; Jin Jingyi; Bindu Preeti; Sakhare Snehal C.; Spisak Joseph S.
Original format
PDF
Chinese Library Classification
TP212
Keywords
Big Data; GPGPU; Hadoop; Integrated Graphics; Machine Learning; Mahout; OpenCL
