Journal: IEEE Transactions on Parallel and Distributed Systems

Moving Hadoop into the Cloud with Flexible Slot Management and Speculative Execution


Abstract

Load imbalance is a major source of overhead in parallel programs such as MapReduce. Due to the uneven distribution of input data, tasks with more data become stragglers and delay overall job completion. Running Hadoop in a private cloud opens up opportunities for expediting stragglers with more resources, but it also introduces problems that often outweigh the performance gain: (1) performance interference from co-running jobs may create new stragglers; (2) a semantic gap exists between Hadoop task management and resource pool-based virtual cluster management, which prevents tasks from using resources efficiently. In this paper, we strive to make Hadoop more resilient to data skew and more efficient in cloud environments. We present FlexSlot, a user-transparent task slot management scheme that automatically identifies map stragglers and resizes their slots accordingly to accelerate task execution. FlexSlot adaptively changes the number of slots on each virtual node to balance resource usage so that the pool of resources can be efficiently utilized. FlexSlot further improves mitigation of data skew with an adaptive speculative execution strategy. Experimental results show that FlexSlot reduces job completion time by up to 47.2 percent compared to stock Hadoop and two recently proposed skew mitigation and speculative execution approaches.
