Optimizing cost and performance trade-offs for MapReduce job processing in the cloud

In: 2014 IEEE Network Operations and Management Symposium: Management in a Software-Defined World

Abstract

Cloud computing offers customers a new, attractive option: provisioning a suitably sized Hadoop cluster, consuming resources as a service, executing the MapReduce workload, and paying for the time these resources were used. One of the open questions in such environments is the type and amount of resources that a user should lease from the service provider. In this work, we offer a framework for evaluating and selecting the right underlying platform (e.g., small, medium, or large EC2 instances) and achieving the desired Service Level Objectives (SLOs). A user can define a set of different SLOs: i) achieving a given completion time for a set of MapReduce jobs while minimizing the cost (budget), or ii) for a given budget, selecting the type and number of instances that optimize the MapReduce workload performance (i.e., the completion time of the jobs). We demonstrate that the application performance of a customer workload may vary significantly on different platforms. This makes selecting the best cost/performance platform for a given workload a challenging problem. Our evaluation study and experiments with the Amazon EC2 platform reveal that, for different workload mixes, the optimized platform choice may result in 37–70% cost savings for achieving the same performance objectives compared with different (but seemingly equivalent) choices. The results of our simulation study are validated through experiments with Hadoop clusters deployed on different Amazon EC2 instances.
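
The two SLO formulations in the abstract can be illustrated with a small brute-force search. This is only a sketch under stated assumptions, not the framework the paper describes: the completion-time model (a fixed serial part plus a perfectly parallel part), the platform names, the prices, and the workload figures are all made up for illustration, and billing is assumed to be proportional to the time used with no rounding.

```python
from dataclasses import dataclass
from typing import Iterator, List, NamedTuple, Optional


@dataclass(frozen=True)
class Platform:
    """Hypothetical instance type; every number here is illustrative."""
    name: str
    price_per_hour: float   # assumed price per instance-hour
    serial_hours: float     # assumed part of the workload that does not speed up
    parallel_hours: float   # assumed part that divides evenly across instances


class Choice(NamedTuple):
    platform: str
    instances: int
    hours: float   # estimated completion time of the whole workload
    cost: float    # instances * price * hours


# Made-up platforms, NOT real EC2 prices or measured job profiles.
PLATFORMS: List[Platform] = [
    Platform("small", price_per_hour=1.0, serial_hours=1.0, parallel_hours=40.0),
    Platform("large", price_per_hour=4.0, serial_hours=0.5, parallel_hours=8.0),
]


def candidates(platforms: List[Platform], max_n: int) -> Iterator[Choice]:
    """Enumerate every (platform, cluster size) pair up to max_n instances."""
    for p in platforms:
        for n in range(1, max_n + 1):
            hours = p.serial_hours + p.parallel_hours / n  # toy model
            yield Choice(p.name, n, hours, n * p.price_per_hour * hours)


def cheapest_within_deadline(platforms: List[Platform], deadline_hours: float,
                             max_n: int = 64) -> Optional[Choice]:
    """SLO i): meet the completion-time target at minimum cost."""
    ok = [c for c in candidates(platforms, max_n) if c.hours <= deadline_hours]
    return min(ok, key=lambda c: c.cost, default=None)


def fastest_within_budget(platforms: List[Platform], budget: float,
                          max_n: int = 64) -> Optional[Choice]:
    """SLO ii): stay within the budget at minimum completion time."""
    ok = [c for c in candidates(platforms, max_n) if c.cost <= budget]
    return min(ok, key=lambda c: c.hours, default=None)


if __name__ == "__main__":
    print(cheapest_within_deadline(PLATFORMS, deadline_hours=5.0))
    print(fastest_within_budget(PLATFORMS, budget=48.0))
```

In this toy setting the two searches share the same candidate set and differ only in which quantity is the constraint and which is the objective. A realistic version would replace the toy time model with completion-time estimates obtained per platform, for example from simulation or measured job profiles.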
