Journal: IEEE Transactions on Parallel and Distributed Systems

Optimization for Speculative Execution in Big Data Processing Clusters



Abstract

A large parallel processing job can be delayed substantially if even one of its many tasks is assigned to an unreliable or congested machine. To tackle this so-called straggler problem, most parallel processing frameworks, such as MapReduce, have adopted strategies under which the system may speculatively launch additional copies of a task whose progress is abnormally slow when extra idle resources are available. In this paper, we focus on the design of speculative execution schemes for parallel processing clusters from an optimization perspective under different loading conditions. For the lightly loaded case, we analyze and propose a cloning scheme, namely the Smart Cloning Algorithm (SCA), which is based on maximizing the overall system utility. We also derive the workload threshold under which SCA should be used for speculative execution. For the heavily loaded case, we propose the Enhanced Speculative Execution (ESE) algorithm, an extension of the Microsoft Mantri scheme, to mitigate stragglers. Our simulation results show that SCA reduces the total job flowtime, i.e., the job delay/response time, by nearly 6 percent compared to the speculative execution strategy of Microsoft Mantri. In addition, we show that the ESE algorithm outperforms the Mantri baseline scheme by 71 percent in terms of job flowtime while consuming the same amount of computation resources.
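The straggler effect described in the abstract can be illustrated with a small toy simulation. The sketch below is not SCA, ESE, or Mantri; it only assumes that a job finishes when its slowest task finishes and that a cloned task finishes when its fastest copy finishes. The duration model, the straggler probability, the slowdown factor, and all function names are hypothetical choices made for illustration, and the sketch ignores the extra resources that clones consume.

```python
import random


def job_flowtime(task_durations):
    """A job completes only when its slowest task completes."""
    return max(task_durations)


def sample_task_duration(rng, straggler_prob=0.1, slowdown=8.0):
    """Hypothetical duration model: a task normally takes about one
    time unit, but with probability straggler_prob it lands on a slow
    machine and takes `slowdown` times longer."""
    base = rng.uniform(0.8, 1.2)
    if rng.random() < straggler_prob:
        return base * slowdown
    return base


def simulate_job(rng, num_tasks, copies_per_task):
    """Launch copies_per_task copies of every task at the start
    (cloning); each task finishes when its fastest copy finishes."""
    durations = []
    for _ in range(num_tasks):
        copies = [sample_task_duration(rng) for _ in range(copies_per_task)]
        durations.append(min(copies))
    return job_flowtime(durations)


def mean_flowtime(num_jobs, num_tasks, copies_per_task, seed=0):
    """Average job flowtime over num_jobs independent simulated jobs."""
    rng = random.Random(seed)
    total = sum(
        simulate_job(rng, num_tasks, copies_per_task) for _ in range(num_jobs)
    )
    return total / num_jobs


if __name__ == "__main__":
    print("no cloning:", mean_flowtime(500, 20, copies_per_task=1))
    print("two copies:", mean_flowtime(500, 20, copies_per_task=2))
```

Under these assumptions, a single slow task dominates the flowtime of an uncloned job, while a second copy makes it much less likely that every copy of a task is slow. Whether cloning is worthwhile in a real cluster depends on the load, which is the trade-off the abstract says the paper analyzes.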
