System-Level, Unified In-band and Out-of-band Dynamic Thermal Control

Abstract

High-density computer racks are becoming increasingly commonplace in supercomputing centers and data centers. With high-powered computing components tightly integrated in the racks, hot spots (pockets of elevated temperature on the chips and in the system) form easily when room air circulation is ineffective. Hot spots reduce the reliability of high-density systems and increase the chance of thermal emergencies, which in turn trigger system slowdowns or shutdowns. Techniques such as dynamically scaling down CPU voltage and controlling fan speed are available on today's systems to reduce heat generation and dissipate heat. Unfortunately, these techniques work independently, without cooperation. As a result, to prevent thermal emergencies, systems may run at reduced capacity when full capacity is required. We propose a combined in-band and out-of-band approach to reduce the likelihood of thermal emergency slowdowns and improve system reliability. Our thermal control framework unifies the temperature control mechanisms in a system to balance temperature, power consumption, and performance. More precisely, we balance the use of in-band dynamic voltage and frequency scaling (DVFS) with out-of-band proactive fan control. Our results on a power-aware cluster indicate that the coordinated use of fan control and DVFS is more effective than either technique in isolation at reducing average system operating temperatures while delivering expected performance.
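
The abstract does not describe the control policy itself, so the following is only a minimal sketch of what coordinating the two mechanisms could look like, not the authors' framework. It assumes a simple rule: use fan headroom first when the system is hot, and lower the CPU frequency only once the fan is saturated. Every name, threshold, and frequency level here (`State`, `step`, `T_HIGH`, `FREQS_MHZ`, and so on) is hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; none come from the paper.
FREQS_MHZ = (1000, 1800, 2400)   # available DVFS levels, lowest to highest
FAN_STEP = 20                    # fan duty-cycle change per decision, percent
FAN_MIN, FAN_MAX = 20, 100       # allowed fan duty-cycle range, percent
T_HIGH, T_LOW = 70.0, 55.0       # hysteresis thresholds, degrees Celsius


@dataclass(frozen=True)
class State:
    fan_pct: int    # out-of-band actuator: fan duty cycle
    freq_idx: int   # in-band actuator: index into FREQS_MHZ


def step(state: State, temp_c: float) -> State:
    """Return the next actuator settings for one temperature reading."""
    if temp_c > T_HIGH:
        # Too hot: spend fan headroom first, since it costs no performance.
        if state.fan_pct < FAN_MAX:
            return State(min(FAN_MAX, state.fan_pct + FAN_STEP), state.freq_idx)
        # Fan is saturated: fall back to lowering the CPU frequency.
        return State(state.fan_pct, max(0, state.freq_idx - 1))
    if temp_c < T_LOW:
        # Cool enough: restore performance before saving fan power.
        if state.freq_idx < len(FREQS_MHZ) - 1:
            return State(state.fan_pct, state.freq_idx + 1)
        return State(max(FAN_MIN, state.fan_pct - FAN_STEP), state.freq_idx)
    # Inside the hysteresis band: hold the current settings.
    return state
```

In this sketch, a reading above `T_HIGH` with the fan at 80% raises the fan to 100% and leaves the frequency alone; only a further hot reading at full fan speed drops the frequency one level. A real controller would read temperatures and drive the fan and CPU through platform interfaces, which are deliberately left out here.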