Systemwide Power Management Targeting Early Hardware Overprovisioned High Performance Computers

Abstract

High performance computing (HPC) systems are an important enabling tool for modern scientific discovery. Since the 1990s, these large-scale computing systems have increasingly been built as clusters of commodity computers. The operational energy needs of these clusters have led the HPC community to focus on energy-efficient hardware and programming practices. One of the major side effects of introducing energy-efficient hardware is variability in power consumption between components within the cluster. In practice, power variability at scale has resulted in poor power utilization and challenges for the energy providers contracted to supply the needed power. Hardware overprovisioned HPC systems have been proposed to improve power utilization; however, production deployment of such a system would compound the challenge for energy providers.

This dissertation presents foundational work on HPC power scheduling, a technique that reduces the risks associated with operating hardware overprovisioned HPC systems. Power scheduling is formalized using the power scheduling invariant. The generalized behavior of applications running under a power cap is studied experimentally. Insights from the study are used to develop a power scheduler and a power capping cluster simulator. The comparative behavior of different power scheduling strategies is also examined.

Utilizing the power scheduling invariant, the safety of any power scheduler for deployment can be proven by analyzing the scheduler's algorithm and mechanism. A general trend exists in power-capped application performance that can be related to application progress, the underlying physics of the hardware, and expected runtime dilation. PowSched provides a proof by construction that power scheduling can be done safely and effectively, without application-specific models, using a simple feedback mechanism. Experimentally, PowSched was shown to produce a 14% improvement in throughput compared to a fair distribution of power between cluster components. PowSim provides a proof by construction that the generalized effects on runtime can be efficiently simulated at scale, providing critical simulation infrastructure for researchers exploring power scheduling at scale. Using simulation, power scheduling strategies are studied, and dynamic power scheduling appears to outperform static and reservation-based techniques.

This dissertation includes previously published and unpublished co-authored material.

Bibliographic record

  • Author

    Ellsworth Daniel;

  • Author affiliation
  • Year 2017
  • Total pages
  • Original format PDF
  • Text language en_US
  • Chinese Library Classification
