Journal: Future Generation Computer Systems

Energy and performance improvements in stencil computations on multi-node HPC systems with different network and communication topologies


Abstract

Energy and performance improvements in stencil computations are relevant for both application developers and data center administrators. Stencil computations appear as the fundamental scheme in many large-scale scientific simulations and workloads. Many research efforts have focused on techniques for estimating the energy usage of HPC systems based on specific characteristics of parallel applications. In the case of stencils, we have previously concentrated on detailed estimations of energy consumption and on the energy-aware distribution of stencil computations on heterogeneous processors. However, those comprehensive studies were restricted to a single heterogeneous computing node. In this paper, we show how scheduling and optimization techniques can be applied to improve the energy usage and performance of stencil computations on multi-node HPC systems using different network topologies. We formulate a scheduling model together with a new Tabu Search algorithm, called Task Movement (TM), which takes the communication hierarchies into account, to minimize the overall energy usage and the execution time of stencil computations. Experimental studies show that this algorithm solves the considered problem more efficiently than other, simpler heuristics. We present computational experiments for a reference 7-point stencil computation pattern on three commonly used low-diameter network topologies: Fat-tree, Dragonfly, and Torus. According to our studies, the most promising multi-node HPC architecture for stencil computations is based on the Torus network concept. Finally, we argue that the proposed scheduling model and TM algorithm can be easily adopted within existing high-level parallel execution environments for automatic performance tuning of stencils.
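
For readers unfamiliar with the reference pattern named in the abstract, the following is a minimal sketch of one sweep of a generic 7-point stencil on a 3D grid: each interior cell is updated from itself and its six face neighbors. The function name `stencil_7pt_step`, the coefficients `c_center` and `c_neighbor`, and the choice to leave boundary cells unchanged are illustrative assumptions; the abstract does not specify the kernel used in the paper.

```python
def stencil_7pt_step(grid, c_center=0.25, c_neighbor=0.125):
    """Apply one Jacobi-style sweep of a 7-point stencil to a 3D grid.

    `grid` is a nested list indexed as grid[i][j][k]. The coefficients are
    hypothetical placeholders, not values taken from the paper. Boundary
    cells are copied through unchanged (an assumption for simplicity).
    """
    nx, ny, nz = len(grid), len(grid[0]), len(grid[0][0])
    # Write into a fresh copy so every update reads only old values.
    out = [[row[:] for row in plane] for plane in grid]
    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            for k in range(1, nz - 1):
                neighbors = (
                    grid[i - 1][j][k] + grid[i + 1][j][k]
                    + grid[i][j - 1][k] + grid[i][j + 1][k]
                    + grid[i][j][k - 1] + grid[i][j][k + 1]
                )
                out[i][j][k] = c_center * grid[i][j][k] + c_neighbor * neighbors
    return out


if __name__ == "__main__":
    n = 5
    grid = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    grid[2][2][2] = 8.0  # single hot cell in the middle
    result = stencil_7pt_step(grid)
    print(result[2][2][2], result[1][2][2])
```

In a multi-node setting, each node would own a sub-block of the grid and exchange the boundary layers of that block with its neighbors before every sweep, which is why the network topology and task placement discussed in the abstract affect both run time and energy.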
