Conference: IEEE International Parallel and Distributed Processing Symposium

SMapReduce: Optimising Resource Allocation by Managing Working Slots at Runtime


Abstract

Hadoop version 1 (HadoopV1) and version 2 (YARN) manage the resources of a distributed system in different ways. HadoopV1 executes MapReduce tasks in statically configured working slots, whereas YARN uses a set of task containers to encapsulate memory and CPU resources. However, neither of them considers the runtime performance of the cluster when deciding how many concurrent tasks to run on each node to achieve optimal throughput. To gain higher performance, Hadoop users usually have to rely on their experience to carefully configure both the resources of the cluster and the resources needed by their jobs. But because the workload in the cluster is constantly changing, such manual configuration rarely leads to optimal performance. In this paper, we study MapReduce job performance in HadoopV1 and YARN under different resource configurations, and model the cluster throughput in terms of the resource capacity of the cluster. We propose SMapReduce, which dynamically manages the number of concurrent tasks running on each node. SMapReduce can maximize job throughput by considering the thrashing phenomenon and the balance between map and reduce tasks. Evaluation results show that SMapReduce can yield significant performance speedup compared to both HadoopV1 and YARN for various MapReduce workloads.
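The abstract does not spell out how the number of concurrent tasks is chosen, so the following is only a generic illustration of the idea of adapting slot counts to observed throughput while avoiding thrashing. It is a simple hill-climbing sketch, not SMapReduce's actual algorithm. The names `tune_slots` and `toy_throughput`, and the toy throughput model itself, are hypothetical and assumed purely for illustration.

```python
from typing import Callable


def tune_slots(measure_throughput: Callable[[int], float],
               start: int = 1, max_slots: int = 64) -> int:
    """Hill-climb over the slot count (illustrative, not the paper's method).

    Adds one slot at a time while measured throughput keeps improving and
    stops at the first slot count that fails to improve it.
    """
    best_slots = start
    best_tp = measure_throughput(start)
    for slots in range(start + 1, max_slots + 1):
        tp = measure_throughput(slots)
        if tp <= best_tp:
            # Throughput stopped improving: treat this as the onset of
            # thrashing and keep the previous slot count.
            break
        best_slots, best_tp = slots, tp
    return best_slots


def toy_throughput(slots: int, capacity: int = 6) -> float:
    """Hypothetical node model (an assumption, not measured data).

    Throughput grows linearly up to `capacity` concurrent tasks, then
    declines as extra tasks contend for the same resources.
    """
    if slots <= capacity:
        return float(slots)
    return capacity - 0.5 * (slots - capacity)


if __name__ == "__main__":
    print(tune_slots(toy_throughput))
```

In a real cluster the `measure_throughput` callback would have to come from runtime monitoring of each node, and the search would need to be repeated as the workload changes; both are left out of this sketch.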
