QTMS: A quadratic time complexity topology-aware process mapping method for large-scale parallel applications on shared HPC system

Abstract

Communication degrades the performance of parallel applications that run on thousands of CPU cores and exchange large quantities of data. The high communication cost is usually attributed to the mismatch between the communication patterns of parallel applications and the physical topology graphs of the computing resources (or the underlying network topologies). Topology-aware process mapping methods can usually obtain a better embedding scheme and thereby improve communication performance. Many existing heuristic-search-based mapping methods have high execution times for large-scale applications. Some low-cost graph-partitioning-based mapping methods depend on the allocated resources forming a regular structure, which is usually impractical in high performance computing systems shared by multiple users and applications, and this weakens their performance. Other graph-partitioning-based mapping methods come at a high cost or require users to provide the network structure information. To address these issues, this paper presents a topology-aware process mapping method with quadratic time complexity. The experimental results show that the proposed method often achieves better application communication performance than several state-of-the-art mapping methods on a shared HPC system, while maintaining a significantly lower execution cost. Moreover, real-world scientific application proxies gain an execution time reduction of up to 14.60% at the 512-process scale compared to the system default process placement on the TianHe-2 HPC system. (C) 2020 Elsevier B.V. All rights reserved.

Bibliographic details

  • Source
    Parallel Computing | 2020, Issue 6 | 102637.1-102637.13 | 13 pages
  • Author affiliations

    Beihang Univ Sch Comp Sci & Engn Beijing 100191 Peoples R China;

    Beihang Univ Sch Comp Sci & Engn Beijing 100191 Peoples R China;

    Beijing Union Univ Smart City Coll Beijing 100101 Peoples R China;

    Inst Appl Phys & Computat Math 2 East Fenghao Rd Beijing 100094 Peoples R China;

    Lawrence Berkeley Natl Lab One Cyclotron Rd Berkeley CA 94720 USA;

    Beihang Univ Sch Comp Sci & Engn Beijing 100191 Peoples R China;

    Beihang Univ Sch Comp Sci & Engn Beijing 100191 Peoples R China;

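  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords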

    Topology-aware process mapping; Communication optimization; Shared HPC system;
