Journal: IEEE Transactions on Computers

Dynamic Resource Allocation for MapReduce with Partitioning Skew

Abstract

MapReduce has become a prevalent programming model for building data processing applications in the cloud. Although they are widely used, existing MapReduce schedulers still suffer from partitioning skew, in which the output of map tasks is unevenly distributed among reduce tasks. Existing solutions share a common principle: they repartition the workload among reduce tasks. However, these approaches often incur high performance overhead from partition size prediction and repartitioning. In this paper, we present DREAMS, a framework that provides run-time partitioning skew mitigation. Instead of repartitioning the workload among reduce tasks, we cope with partitioning skew by controlling the amount of resources allocated to each reduce task. Our approach completely eliminates the repartitioning overhead, yet is simple to implement. Experiments using both real and synthetic workloads running on a 21-node Hadoop cluster demonstrate that DREAMS can effectively mitigate the negative impact of partitioning skew, thereby improving the job completion time by up to a factor of 2.29 over native Hadoop YARN. Compared to the state-of-the-art solution, DREAMS can improve the job completion time by a factor of 1.65.
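
To make the idea in the abstract concrete, the following is a minimal sketch of partitioning skew and of sizing each reduce task's resources to its partition instead of repartitioning. It is not the DREAMS implementation, whose prediction and allocation models are not described here. The function names, the modulo partitioner, the abstract "resource units", and the proportional-share rule are all assumptions chosen for illustration.

```python
def partition_sizes(keys, num_reducers):
    """Count the map-output records each reduce task receives.

    Assumption: integer keys and a simple modulo partitioner stand in
    for a real hash partitioner.
    """
    sizes = [0] * num_reducers
    for key in keys:
        sizes[key % num_reducers] += 1
    return sizes


def allocate_proportionally(sizes, total_units, min_units=1):
    """Split total_units of a resource across reduce tasks by partition size.

    Hypothetical rule: every task gets min_units, and the remaining
    units are shared in proportion to partition size (largest-remainder
    rounding keeps the total exact).
    """
    n = len(sizes)
    spare = total_units - min_units * n
    if spare < 0:
        raise ValueError("not enough resource units for the minimum allocation")

    total_size = sum(sizes)
    if total_size == 0:
        shares = [spare / n] * n
    else:
        shares = [spare * size / total_size for size in sizes]

    # Start from the integer part of each share, then hand out the
    # leftover units to the tasks with the largest fractional parts.
    alloc = [min_units + int(share) for share in shares]
    leftover = total_units - sum(alloc)
    by_remainder = sorted(
        range(n), key=lambda i: shares[i] - int(shares[i]), reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


# A skewed key distribution: key 0 dominates the map output.
skewed_keys = [0] * 70 + [1] * 20 + [2] * 10
sizes = partition_sizes(skewed_keys, num_reducers=3)
allocation = allocate_proportionally(sizes, total_units=13)
```

With these inputs, the three partitions hold 70, 20, and 10 records, and the 13 resource units are split 8, 3, and 2, so the task with the largest partition receives the most resources while the partitions themselves are left untouched.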