Source: Recent Advances in the Message Passing Interface (conference proceedings)

A Scalable MPI_Comm_split Algorithm for Exascale Computing


Abstract

Existing algorithms for creating communicators in MPI programs will not scale well to future exascale supercomputers containing millions of cores. In this work, we present a novel communicator-creation algorithm that scales well to millions of processes using three techniques: replacing the sort at the end of MPI_Comm_split with merging as the color and key table is built, sorting the color and key table in parallel, and storing the output communicator data in a distributed table rather than a replicated one. This reduces the time cost of MPI_Comm_split in the worst case we consider from 22 seconds to 0.37 seconds. Existing algorithms build a table with as many entries as there are processes, which uses vast amounts of memory. Our algorithm uses a small, fixed amount of memory per communicator after MPI_Comm_split has finished, and during the execution of MPI_Comm_split it uses only a fraction of the temporary storage that the conventional algorithm needs.
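The first technique, merging while the color and key table is gathered instead of sorting it at the end, can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: the function name `split_by_merging` and the pairwise, tree-shaped combine order are hypothetical, there is no real MPI communication, and `MPI_UNDEFINED` colors are ignored. It relies on the ordering rule of `MPI_Comm_split`: within each color, new ranks follow the key, with ties broken by the rank in the old communicator.

```python
from heapq import merge


def split_by_merging(entries):
    """Simulate the result of MPI_Comm_split for entries[r] = (color, key).

    Returns {color: [old ranks listed in new-rank order]}.
    Hypothetical helper for illustration; no MPI calls are made.
    """
    # Each rank starts with a one-entry table, which is trivially sorted.
    # Tuples compare by color, then key, then old rank (the tie-breaker).
    tables = [[(color, key, rank)] for rank, (color, key) in enumerate(entries)]

    # Combine tables pairwise, as a tree-shaped gather would. Merging two
    # sorted tables yields a sorted table, so no final sort is required.
    while len(tables) > 1:
        tables = [
            list(merge(*tables[i:i + 2]))
            for i in range(0, len(tables), 2)
        ]

    # Walk the fully merged table and assign new ranks per color.
    groups = {}
    for color, _key, rank in (tables[0] if tables else []):
        groups.setdefault(color, []).append(rank)
    return groups


if __name__ == "__main__":
    # Five old ranks, each supplying (color, key).
    print(split_by_merging([(0, 3), (1, 0), (0, 1), (1, 0), (0, 1)]))
    # prints {0: [2, 4, 0], 1: [1, 3]}
```

In this simulation the whole table still ends up in one place. The abstract's other two techniques, sorting in parallel and keeping the output in a distributed table, are what avoid that replication and are not modeled here.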
