Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning

Dazhao Cheng; Jia Rao; Yanfei Guo; Changjun Jiang; Xiaobo Zhou

Abstract

Datacenter-scale clusters are evolving toward heterogeneous hardware architectures due to continuous server replacement. Meanwhile, datacenters are commonly shared by many users for quite different uses, and such shared clusters often exhibit significant performance heterogeneity due to multi-tenant interference. Deploying MapReduce on such heterogeneous clusters makes it significantly harder to achieve good application performance than on in-house dedicated clusters. As most MapReduce implementations were originally designed for homogeneous environments, heterogeneity can cause significant performance deterioration in job execution despite existing optimizations in task scheduling and load balancing. In this paper, we observe that the homogeneous configuration of tasks on heterogeneous nodes can be an important source of load imbalance and thus cause poor performance. Tasks should be customized with different configurations to match the capabilities of heterogeneous nodes. To this end, we propose a self-adaptive task tuning approach, Ant, that automatically searches for the optimal configurations of individual tasks running on different nodes. In a heterogeneous cluster, Ant first divides nodes into a number of homogeneous subclusters based on their hardware configurations. It then treats each subcluster as a homogeneous cluster and independently applies the self-tuning algorithm to each of them. Finally, Ant configures tasks with randomly selected configurations and gradually improves task configurations by reproducing the configurations of the best-performing tasks and discarding poorly performing configurations. To accelerate task tuning and avoid being trapped in local optima, Ant uses a genetic algorithm during adaptive task configuration.
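
The tuning loop described above (random initial configurations, reproduction of the best performers, discarding of the poor ones) can be illustrated with a minimal genetic-algorithm sketch. This is not the authors' implementation: the parameter names in SEARCH_SPACE, the synthetic_task_time cost function, and all numeric settings are hypothetical stand-ins, and in a real cluster the cost would be a measured task completion time within one homogeneous subcluster.

```python
import random

# Hypothetical search space (illustrative names, not real Hadoop keys):
# parameter -> (lowest allowed value, highest allowed value).
SEARCH_SPACE = {
    "sort_buffer_mb": (50.0, 400.0),
    "spill_percent": (0.5, 0.95),
}

def random_config(rng):
    """Draw one task configuration uniformly from the search space."""
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}

def synthetic_task_time(cfg):
    """Assumed stand-in for a measured task completion time (lower is better)."""
    return ((cfg["sort_buffer_mb"] - 250.0) ** 2 / 1000.0
            + (cfg["spill_percent"] - 0.8) ** 2 * 100.0)

def crossover(a, b, rng):
    """Build a child by taking each parameter from one of the two parents."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}

def mutate(cfg, rng, rate=0.2):
    """Perturb some parameters with Gaussian noise, clamped to the bounds."""
    out = dict(cfg)
    for k, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            out[k] = min(hi, max(lo, out[k] + rng.gauss(0.0, 0.1 * (hi - lo))))
    return out

def tune(measure, population=20, generations=30, keep=0.5, seed=0):
    """Evolve configurations: keep the fastest, replace the rest with offspring."""
    rng = random.Random(seed)
    configs = [random_config(rng) for _ in range(population)]
    for _ in range(generations):
        ranked = sorted(configs, key=measure)
        survivors = ranked[: max(2, int(population * keep))]
        children = []
        while len(survivors) + len(children) < population:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        configs = survivors + children
    return min(configs, key=measure)

best = tune(synthetic_task_time)
print(best, synthetic_task_time(best))
```

Because the best-performing configurations survive each generation unchanged, the best cost found never gets worse from one generation to the next, while mutation keeps exploring nearby settings so the search is less likely to stall at a local optimum.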
Experimental results on a heterogeneous physical cluster with varying hardware capabilities show that Ant improves the average job completion time by 31, 20, and 14 percent compared to stock Hadoop (Stock), customized Hadoop with industry recommendations (Heuristic), and a profiling-based configuration approach (Starfish), respectively. Furthermore, we extend Ant to virtual MapReduce clusters in a multi-tenant private cloud. Specifically, Ant characterizes a virtual node based on two measured performance statistics: I/O rate and CPU steal time. It uses the k-means clustering algorithm to classify virtual nodes into configuration groups based on the measured dynamic interference. Experimental results on virtual clusters with varying interference show that Ant improves the average job completion time by 20, 15, and 11 percent compared to Stock, Heuristic, and Starfish, respectively.
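
The grouping of virtual nodes by measured interference can be sketched with a plain k-means (Lloyd's algorithm) over the two statistics named in the abstract. The node names, the sample measurements, the choice of k = 2, and the min-max normalization step are assumptions made for illustration only; the abstract does not specify them.

```python
import random

def normalize(points):
    """Min-max scale each feature to [0, 1] so neither metric dominates."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [tuple((v - lo) / s for v, lo, s in zip(p, lows, spans))
            for p in points]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=50, seed=0):
    """Return one cluster label per point using Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels

# Hypothetical measurements: node -> (I/O rate in MB/s, CPU steal time in %).
nodes = {
    "vm1": (120.0, 1.0), "vm2": (115.0, 2.0), "vm3": (40.0, 20.0),
    "vm4": (35.0, 25.0), "vm5": (118.0, 1.5), "vm6": (38.0, 22.0),
}
labels = kmeans(normalize(list(nodes.values())), k=2)
groups = dict(zip(nodes, labels))
print(groups)
```

Each resulting group could then be tuned as its own configuration group. Since interference changes over time, the measurements would need to be refreshed and the clustering rerun periodically.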

Bibliographic information

Source
IEEE Transactions on Parallel and Distributed Systems | 2017, Issue 3 | pp. 774-786 | 13 pages
Authors
Dazhao Cheng; Jia Rao; Yanfei Guo; Changjun Jiang; Xiaobo Zhou;
Author affiliations

Department of Computer Science, University of North Carolina at Charlotte, NC;

Department of Computer Science, University of Colorado, Colorado Springs, CO;

Postdoctoral Fellow, Argonne National Laboratory, Lemont, IL;

Department of Computer Science & Technology, Tongji University, Jiading, Shanghai, China;

Department of Computer Science, University of Colorado, Colorado Springs, CO;

Original format: PDF
Language: English
Keywords
Tuning; Hardware; Cloud computing; Interference; Optimization; Clustering algorithms; Industries;
