Conference: IEEE International Conference on Smart Cloud

Improving MapReduce Performance with Progress and Feedback Based Speculative Execution


Abstract

Task stragglers dramatically impede parallel job execution in data-intensive computing in cloud datacenters. Uneven distribution of input data, which results from heterogeneous data nodes, resource contention, and network configurations, causes delay failures, that is, violations of the job completion time. Data-intensive computing frameworks such as MapReduce or Hadoop employ a mechanism called speculative execution to deal with stragglers, but its effectiveness is limited because in many cases a straggler is identified too late in the job lifecycle. Both identifying a straggler and the timing of that identification are therefore very important for straggler mitigation in data-intensive cloud computing. Speculative execution is widely adopted as a straggler identification and mitigation scheme, but it has certain inherent limitations. In this paper, we strive to make Hadoop more efficient in cloud environments. We present the Progress and Feedback based Speculative Execution algorithm (PFSE), a new straggler identification scheme that identifies straggler MapReduce tasks based on feedback information received from completed tasks in addition to the progress of the currently running task. Our extensive simulation shows that PFSE can outperform dynamic scheduling techniques such as the Self-Learning MapReduce scheduler (SLM) and LATE. PFSE can help improve straggler identification and mitigation for tolerating late-timing failures in data-intensive cloud computing.
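
The abstract does not give the PFSE algorithm itself, so the following is only a minimal sketch of the general idea it names: combining the progress of a running task with feedback from completed tasks. Every name here (`RunningTask`, `estimated_total_time`, `find_stragglers`) and the `slowdown` threshold of 1.5 are assumptions made for illustration, not the paper's definitions.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunningTask:
    # Hypothetical record of a task that is still executing.
    task_id: str
    progress: float  # fraction of work done, between 0 and 1
    elapsed: float   # seconds since the task started


def estimated_total_time(task: RunningTask) -> float:
    """Extrapolate the total runtime from the observed progress rate."""
    if task.progress <= 0:
        return float("inf")  # no progress yet: treated as unboundedly slow
    return task.elapsed / task.progress


def find_stragglers(running, completed_durations, slowdown=1.5):
    """Return ids of running tasks whose projected runtime exceeds
    `slowdown` times the mean duration of completed tasks.

    The completed-task durations play the role of the feedback signal;
    the threshold rule is an assumption, not taken from the paper.
    """
    if not completed_durations:
        return []  # no feedback yet, so there is nothing to compare against
    baseline = mean(completed_durations)
    return [t.task_id for t in running
            if estimated_total_time(t) > slowdown * baseline]


running = [RunningTask("m1", 0.5, 20.0), RunningTask("m2", 0.2, 30.0)]
stragglers = find_stragglers(running, [40.0, 44.0, 36.0])
```

In this sketch a scheduler would call `find_stragglers` periodically and launch a speculative copy of each returned task. Because the baseline comes from tasks that have already finished, the check can fire as soon as the first completions arrive rather than late in the job.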
