Conference: IEEE International Conference on Cluster Computing

Optimizing power allocation to CPU and memory subsystems in overprovisioned HPC systems


Abstract

Energy consumption and power draw pose two major challenges to the HPC community in designing larger systems. Present-day HPC systems consume as much as 10 MW of electricity, and this is fast becoming a bottleneck. Although energy bills will increase significantly with machine size, power consumption is a hard constraint that must be addressed. Intel's Running Average Power Limit (RAPL) toolkit is a recent feature that enables power capping of CPU and memory subsystems on modern hardware. In this paper, we use RAPL to evaluate the possibility of improving the execution time efficiency of an application by capping power while adding more nodes. We profile the strong scaling of an application using different power caps for both CPU and memory subsystems. Our proposed interpolation scheme uses an application profile to optimize the number of nodes and the distribution of power between CPU and memory subsystems so as to minimize execution time under a strict power budget. We validate these estimates by running experiments on a 20-node (120 cores) Sandy Bridge cluster. Our experimental results closely match the model estimates and show speedups greater than 1.47X for all applications compared to not capping CPU and memory power. We demonstrate that the quality of the solution that our interpolation scheme provides closely matches the results obtained via exhaustive profiling.
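The abstract does not spell out the interpolation scheme, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical profile that maps each profiled node count to a table of execution times measured on a coarse grid of per-node CPU and memory caps (integer watts). It estimates the time at unprofiled caps with plain bilinear interpolation, then searches for the configuration with the lowest estimate. It also assumes the power budget counts only the capped CPU and memory power of each node. The names `lerp_index`, `bilinear`, `best_config`, and `step_w` are all illustrative.

```python
from bisect import bisect_right


def lerp_index(grid, x):
    """Return (i, t) with x = grid[i] + t * (grid[i+1] - grid[i]), clamped to the grid."""
    x = min(max(x, grid[0]), grid[-1])
    i = min(bisect_right(grid, x) - 1, len(grid) - 2)
    t = (x - grid[i]) / (grid[i + 1] - grid[i])
    return i, t


def bilinear(cpu_grid, mem_grid, times, cpu_w, mem_w):
    """Interpolate times[i][j], which was profiled at (cpu_grid[i], mem_grid[j])."""
    i, s = lerp_index(cpu_grid, cpu_w)
    j, t = lerp_index(mem_grid, mem_w)
    low_cpu = times[i][j] * (1 - t) + times[i][j + 1] * t
    high_cpu = times[i + 1][j] * (1 - t) + times[i + 1][j + 1] * t
    return low_cpu * (1 - s) + high_cpu * s


def best_config(profile, cpu_grid, mem_grid, budget_w, step_w=1):
    """Return (estimated_time, nodes, cpu_w, mem_w) with the lowest estimate, or None.

    Assumption: a configuration is feasible when nodes * (cpu_w + mem_w) <= budget_w.
    """
    best = None
    for nodes, times in profile.items():
        for cpu_w in range(cpu_grid[0], cpu_grid[-1] + 1, step_w):
            for mem_w in range(mem_grid[0], mem_grid[-1] + 1, step_w):
                if nodes * (cpu_w + mem_w) > budget_w:
                    continue  # exceeds the power budget
                est = bilinear(cpu_grid, mem_grid, times, cpu_w, mem_w)
                if best is None or est < best[0]:
                    best = (est, nodes, cpu_w, mem_w)
    return best
```

Calling `best_config(profile, cpu_grid, mem_grid, budget_w)` with a hand-made profile returns the node count and per-node caps with the lowest interpolated time that fit the budget. The exhaustive loop over candidate caps is chosen for clarity; it makes no assumption about how execution time varies with either cap.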
