Rubik: fast analytical power management for latency-critical systems

Abstract

Latency-critical workloads (e.g., web search), common in datacenters, require stable tail (e.g., 95th percentile) latencies of a few milliseconds. Servers running these workloads are kept lightly loaded to meet these stringent latency targets. This low utilization wastes billions of dollars in energy and equipment annually.

Applying dynamic power management to latency-critical workloads is challenging. The fundamental issue is coping with their inherent short-term variability: requests arrive at unpredictable times and have variable lengths. Without knowledge of the future, prior techniques either adapt slowly and conservatively or rely on application-specific heuristics to maintain tail latency.

We propose Rubik, a fine-grain DVFS scheme for latency-critical workloads. Rubik copes with variability through a novel, general, and efficient statistical performance model. This model allows Rubik to adjust frequencies at sub-millisecond granularity to save power while meeting the target tail latency. Rubik saves up to 66% of core power, widely outperforms prior techniques, and requires no application-specific tuning.

Beyond saving core power, Rubik robustly adapts to sudden changes in load and system performance. We use this capability to design RubikColoc, a colocation scheme that uses Rubik to allow batch and latency-critical work to share hardware resources more aggressively than prior techniques. RubikColoc reduces datacenter power by up to 31% while using 41% fewer servers than a datacenter that segregates latency-critical and batch work, and achieves 100% core utilization.