Source: IEEE Transactions on Parallel and Distributed Systems

Cost-Effective Cloud Server Provisioning for Predictable Performance of Big Data Analytics



Abstract

Cloud datacenters are underutilized due to server over-provisioning. To increase datacenter utilization, cloud providers offer users an option to run workloads such as big data analytics on the underutilized resources, in the form of cheap yet revocable transient servers (e.g., EC2 spot instances, GCE preemptible instances). Although these servers come at highly reduced prices, they are unstable: instance revocations can severely degrade the performance of big data analytics jobs deployed on them. To tackle this issue, this paper proposes iSpot, a cost-effective transient server provisioning framework for achieving predictable performance in the cloud, focusing on Spark as a representative Directed Acyclic Graph (DAG)-style big data analytics workload. iSpot first identifies the cloud transient servers that remain stable during job execution, using an accurate Long Short-Term Memory (LSTM)-based price prediction method. Leveraging automatic job profiling and the acquired DAG information of stages, we further build an analytical performance model and present a lightweight critical data checkpointing mechanism for Spark. Together, these enable the iSpot provisioning strategy, which guarantees job performance on stable transient servers. Extensive prototype experiments on both EC2 spot instances and GCE preemptible instances demonstrate that iSpot is able to guarantee the performance of big data analytics running on cloud transient servers while reducing the job budget by up to 83.8 percent in comparison to state-of-the-art server provisioning strategies, with acceptable runtime overhead.
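The idea of provisioning only on servers that are predicted to stay stable for the whole job can be illustrated with a small sketch. This is not iSpot's implementation: the `Candidate` type, the hourly forecast list (a stand-in for the output of a price predictor such as an LSTM), the `max_price` threshold, and the runtime estimate (a stand-in for the output of a performance model) are all hypothetical names and assumptions introduced here. It also assumes price-driven revocation, where an instance is revoked once the market price exceeds the user's maximum price.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical description of one transient instance type."""
    name: str
    predicted_prices: List[float]  # assumed hourly price forecasts, USD/hour
    max_price: float               # most the user will pay, USD/hour
    est_runtime_hours: int         # assumed estimate from a performance model


def is_stable(c: Candidate) -> bool:
    """Treat an instance as stable if the forecast price never exceeds
    the maximum price during the estimated job runtime."""
    window = c.predicted_prices[: c.est_runtime_hours]
    if len(window) < c.est_runtime_hours:
        return False  # forecast does not cover the whole job
    return all(p <= c.max_price for p in window)


def expected_cost(c: Candidate) -> float:
    """Sum of forecast hourly prices over the estimated runtime."""
    return sum(c.predicted_prices[: c.est_runtime_hours])


def pick_cheapest_stable(cands: List[Candidate]) -> Optional[Candidate]:
    """Return the cheapest candidate predicted to be stable, or None."""
    stable = [c for c in cands if is_stable(c)]
    return min(stable, key=expected_cost, default=None)
```

A caller would build one `Candidate` per instance type and fall back to another option (for example, on-demand servers) when `pick_cheapest_stable` returns `None`. Checkpointing and DAG-aware performance modeling, which the abstract also mentions, are outside the scope of this sketch.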

Bibliographic details

  • Author affiliations

    East China Normal Univ, Dept Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, 3663 N Zhongshan Rd, Shanghai 200062, Peoples R China;

    East China Normal Univ, Dept Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, 3663 N Zhongshan Rd, Shanghai 200062, Peoples R China;

    East China Normal Univ, Dept Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, 3663 N Zhongshan Rd, Shanghai 200062, Peoples R China;

    East China Normal Univ, Dept Comp Sci & Technol, Shanghai Key Lab Multidimens Informat Proc, 3663 N Zhongshan Rd, Shanghai 200062, Peoples R China;

    Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Cluster & Grid Comp Lab, Serv Comp Technol & Syst Lab, 1037 Luoyu Rd, Wuhan 430074, Hubei, Peoples R China;

    Sun Yat Sen Univ, Sch Data & Comp Sci, Guangdong Key Lab Big Data Anal & Proc, 132 E Waihuan Rd, Guangzhou 510006, Guangdong, Peoples R China;

  • Original format: PDF
  • Language: English
  • Keywords

    Predictable performance; big data analytics; cloud computing; transient server provisioning; data checkpointing;

