Venue: IEEE International Symposium on Software Reliability Engineering

Keep it moving: Proactive workload management for reducing SLA violations in large scale SaaS clouds

Abstract

Software failures, workload-related failures, and job overload conditions cause SLA violations in software-as-a-service (SaaS) systems. Existing work does not fully address the mitigation of these violations, for three reasons: (i) no existing approach addresses mitigation in business-specific scenarios (SaaS, in our case); (ii) some approaches do not address software and workload-related failures, while others do not treat the selection of a target physical machine (PM) for workload migration comprehensively, leaving out vital considerations such as workload compatibility checks between the migrating virtual machine (VM) and the VMs at the target PM; and (iii) a clear mathematical mapping between workload, resource demand, and SLA is lacking. In this paper, we present Keep It Moving (KIM), a software framework for the cloud controller that helps minimize service failures caused by violations of availability, utilization, and response-time SLAs in SaaS cloud data centers. Although we consider migration the primary mitigation technique, we also try to mitigate SLA violations without migration. To do so, we perform a capacity check on the host PM before migrating, to determine whether the current PM has enough capacity to address the upcoming SLA violations through a restart/reboot or VM resizing. In certain cases, such as workload-related failures due to corrupt files, we prefer rerouting the workload to a replica VM over migration. We formulate the selection of a target PM as a multi-objective optimization problem. We validate the approach with a trace-based discrete event simulation of a virtualized data center, in which failure and workload characteristics are simulated from data extracted from the server logs of a real SaaS business. We found that our approach can reduce SLA violations by 60% and VM downtime by approximately 10%.
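
The order of mitigation choices described in the abstract (reroute for corrupt-file failures, then a capacity check on the host PM, then migration to a selected target PM) can be illustrated with a minimal Python sketch. This is not the KIM implementation: the abstract does not state the objectives or the solution method of the multi-objective problem, so the weighted-sum score, the two objectives (post-placement headroom and a compatibility score), and all names here (`PM`, `Demand`, `choose_action`, the weights) are assumptions made for illustration. Resources are expressed as fractions of a PM's capacity, which assumes homogeneous PMs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Cause(Enum):
    SOFTWARE_FAILURE = "software_failure"
    CORRUPT_FILE = "corrupt_file"  # one kind of workload-related failure
    OVERLOAD = "overload"


@dataclass
class PM:
    name: str
    free_cpu: float       # spare CPU, as a fraction of PM capacity
    free_mem: float       # spare memory, as a fraction of PM capacity
    compatibility: float  # hypothetical workload-compatibility score in [0, 1]


@dataclass
class Demand:
    cpu: float  # required CPU, as a fraction of PM capacity
    mem: float  # required memory, as a fraction of PM capacity


def fits(pm: PM, demand: Demand) -> bool:
    """Capacity check: can this PM absorb the given demand?"""
    return pm.free_cpu >= demand.cpu and pm.free_mem >= demand.mem


def score(pm: PM, demand: Demand, w_headroom: float = 0.5,
          w_compat: float = 0.5) -> float:
    """Weighted-sum scalarization of two assumed objectives (higher is better)."""
    headroom = min(pm.free_cpu - demand.cpu, pm.free_mem - demand.mem)
    return w_headroom * headroom + w_compat * pm.compatibility


def choose_action(cause: Cause, host: PM, extra_demand: Demand,
                  vm_demand: Demand,
                  candidates: List[PM]) -> Tuple[str, Optional[str]]:
    """Return (action, PM name) for a predicted SLA violation."""
    # Corrupt-file failures: prefer rerouting to a replica VM over migration.
    if cause is Cause.CORRUPT_FILE:
        return ("reroute_to_replica", None)
    # Capacity check on the host PM before considering migration.
    if fits(host, extra_demand):
        return ("restart_or_resize", host.name)
    # Otherwise migrate: pick the best-scoring PM that can hold the whole VM.
    feasible = [pm for pm in candidates if fits(pm, vm_demand)]
    if not feasible:
        return ("no_feasible_target", None)
    best = max(feasible, key=lambda pm: score(pm, vm_demand))
    return ("migrate", best.name)
```

A weighted sum is only one way to reduce several objectives to a single ranking; it was chosen here because it is short, not because the source indicates it.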