Journal: Cluster Computing

AEGEUS++: an energy-aware online partition skew mitigation algorithm for mapreduce in cloud



Abstract

This paper investigates the partition skew problem in the reduce phase of MapReduce jobs. Our study surveys skew mitigation in both offline and online settings. Offline approaches are heuristic-based: they wait for the map tasks to complete and incur computation overhead to estimate partition sizes. Online approaches distribute overloaded tasks across other nodes, which requires extra split and merge operations. These extra operations, together with ineffective resource utilization, hamper the performance of the entire system. We propose Aegeus++ to address the skew mitigation and adaptive data sampling problems for MapReduce jobs; it builds an online prediction model with improved accuracy and minimal waiting time. In addition, we propose near-linear skew detection and fine-grained resource allocation algorithms that identify skewed partitions and allocate appropriate resources to reducers based on partition size. Finally, our energy-aware opportunistic frequency tuning algorithm improves the performance of the reducer container on the fly, so that skewed data is processed faster with minimal energy consumption. We evaluated Aegeus++ in a cloud setup using benchmark datasets and compared its performance with native Hadoop and other approaches. In our experiments, Aegeus++ outperforms native Hadoop by 44% in overall application performance and reduces energy consumption by 37.67% compared with existing approaches.
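The abstract does not give the detection or allocation rules, so the following is only a minimal illustrative sketch of the general idea of flagging oversized partitions in a single pass and sizing reducer resources by partition size. It is not the Aegeus++ algorithm. The function names, the `factor` threshold, the `min_mb` floor, and the sample sizes are all assumptions made for illustration.

```python
from statistics import mean


def detect_skewed(partition_sizes, factor=1.5):
    """Return ids of partitions larger than factor * mean size.

    Hypothetical rule (not from the paper): one pass to compute the
    mean and one pass to compare, so the cost is linear in the
    number of partitions.
    """
    avg = mean(partition_sizes.values())
    return [pid for pid, size in partition_sizes.items() if size > factor * avg]


def allocate_memory(partition_sizes, total_mb, min_mb=256):
    """Give every reducer min_mb, then share the remainder by partition size.

    Hypothetical policy (not from the paper). Integer division rounds
    down, so the plan never exceeds total_mb.
    """
    spare = total_mb - min_mb * len(partition_sizes)
    if spare < 0:
        raise ValueError("total_mb is too small for the per-reducer minimum")
    total_size = sum(partition_sizes.values())
    if total_size == 0:
        # No data predicted anywhere: every reducer gets only the floor.
        return {pid: min_mb for pid in partition_sizes}
    return {
        pid: min_mb + spare * size // total_size
        for pid, size in partition_sizes.items()
    }


# Made-up predicted partition sizes in MB, keyed by reducer id.
sizes = {0: 100, 1: 120, 2: 900, 3: 80}
skewed = detect_skewed(sizes)                  # partition 2 exceeds 1.5 * mean (450 MB)
plan = allocate_memory(sizes, total_mb=4096)   # reducer 2 receives the largest share
print(skewed, plan)
```

In this sketch the reducer handling the skewed partition receives proportionally more memory instead of having its input split and merged across nodes, which is the overhead the abstract attributes to existing online approaches.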
