Conference: IEEE International Conference on Cloud Computing Technology and Science

ReLeaSER: A Reinforcement Learning Strategy for Optimizing Utilization Of Ephemeral Cloud Resources



Abstract

Cloud data center capacities are over-provisioned to handle demand peaks and hardware failures, which leads to low resource utilization. One way to improve resource utilization, and thus reduce the total cost of ownership, is to offer unused resources (referred to as ephemeral resources) at a lower price. However, reselling resources must still meet customers' expectations in terms of Quality of Service. The goal is therefore to maximize the amount of reclaimed resources while avoiding SLA penalties. To achieve that, cloud providers (CPs) have to estimate their future utilization in order to provide availability guarantees. The prediction should include a safety margin of resources so that the provider can react to unpredictable workloads. The challenge is to find the safety margin that provides the best trade-off between the amount of resources to reclaim and the risk of SLA violations. Most state-of-the-art solutions use a fixed safety margin for all types of metrics (e.g., CPU, RAM). However, a single fixed margin does not account for workload variations over time, which may lead to SLA violations and/or poor utilization. To tackle these challenges, we propose ReLeaSER, a Reinforcement Learning strategy for optimizing the utilization of ephemeral resources in the cloud. ReLeaSER dynamically tunes the safety margin at the host level for each resource metric. The strategy learns from past prediction errors (those that caused SLA violations). Our solution significantly reduces SLA violation penalties, by $\mathbf{2.7}\times$ on average and up to $\mathbf{3.4}\times$. It also considerably improves the CPs' potential savings, by 27.6% on average and up to 43.6%.
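The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not specify the state, action, or reward design, so a simple epsilon-greedy learner over a discrete set of candidate margins stands in for the reinforcement learning agent. All names (`evaluate_margin`, `MarginTuner`, `penalty`) and the reward shape (reclaimed capacity minus a weighted violation) are assumptions made for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class MarginOutcome:
    reclaimable: float  # capacity offered as ephemeral resources
    violation: float    # offered capacity that turned out not to be free


def evaluate_margin(capacity, predicted_use, actual_use, margin):
    """Score one prediction interval for a given safety margin.

    `margin` is a fraction of total capacity held back on top of the
    predicted utilization (an assumed definition, for illustration).
    """
    reserved = min(capacity, predicted_use + margin * capacity)
    reclaimable = capacity - reserved
    actually_free = capacity - actual_use
    # A violation occurs when more was offered than was really free.
    violation = max(0.0, reclaimable - actually_free)
    return MarginOutcome(reclaimable, violation)


class MarginTuner:
    """Hypothetical epsilon-greedy tuner; one instance per (host, metric)."""

    def __init__(self, margins, penalty=10.0, epsilon=0.1, lr=0.2, seed=0):
        self.margins = list(margins)
        self.penalty = penalty    # assumed weight of a violation
        self.epsilon = epsilon    # exploration probability
        self.lr = lr              # step size for the value estimates
        self.values = [0.0] * len(self.margins)
        self.rng = random.Random(seed)

    def choose(self):
        """Return the index of the margin to apply next."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.margins))
        return max(range(len(self.margins)), key=lambda i: self.values[i])

    def update(self, index, outcome):
        """Move the chosen margin's value estimate toward the reward."""
        reward = outcome.reclaimable - self.penalty * outcome.violation
        self.values[index] += self.lr * (reward - self.values[index])
        return reward
```

In this sketch a tight margin reclaims more capacity but is punished whenever the utilization prediction undershoots, so after under-predictions the learner shifts toward wider margins. A fixed margin corresponds to a `MarginTuner` with a single candidate value.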
