
Topology-Aware Rank Reordering for MPI Collectives




As we move toward the exascale era, HPC systems are becoming more complex and introducing increasing levels of heterogeneity in their communication channels. As a result, communication performance varies across the levels of the hierarchy in modern HPC systems. Communicating peers such as MPI processes should therefore be mapped onto the target cores in a topology-aware fashion, so as to avoid message transmissions over slower channels. This is especially true for collective communications, owing to the global nature of their communication patterns and their widespread use in parallel applications. In this paper, we exploit the rank reordering mechanism of MPI to realize run-time topology awareness for collective communications, and in particular for MPI_Allgather. To this end, we propose four fine-tuned mapping heuristics for the various communication patterns and algorithms commonly used in MPI_Allgather. The heuristics provide a better match between the collective communication pattern and the topology of the target system. Our experimental results with 4096 processes show that MPI rank reordering with the proposed heuristics can reduce MPI_Allgather latency by up to 78% at the micro-benchmark level. At the application level, we achieve up to a 34% reduction in execution time. The results also show that the proposed heuristics significantly outperform the Scotch library, a general-purpose graph mapping library.
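To make the idea of rank reordering concrete, the following is a minimal sketch, not one of the four heuristics proposed in the paper. It assumes a ring communication pattern (rank r sends to rank r+1, wrapping around) and a hypothetical two-node machine with processes placed round-robin. The function names and the layout are illustrative assumptions; the sketch only counts how many ring messages cross a node boundary before and after a simple reordering that gives processes on the same node consecutive ranks.

```python
# Illustrative sketch only: a naive reordering for a ring pattern,
# not the mapping heuristics described in the paper.

def inter_node_hops(node_of_rank):
    """Count ring edges (r -> r+1 mod p) whose endpoints are on different nodes."""
    p = len(node_of_rank)
    return sum(node_of_rank[r] != node_of_rank[(r + 1) % p] for r in range(p))

def reorder_for_ring(node_of_rank):
    """Return new_rank[old_rank] so that processes sharing a node get consecutive ranks."""
    order = sorted(range(len(node_of_rank)), key=lambda r: (node_of_rank[r], r))
    new_rank = [0] * len(node_of_rank)
    for new, old in enumerate(order):
        new_rank[old] = new
    return new_rank

def apply_reordering(node_of_rank, new_rank):
    """Return the node that hosts each *new* rank after the reordering."""
    placed = [None] * len(node_of_rank)
    for old, new in enumerate(new_rank):
        placed[new] = node_of_rank[old]
    return placed

# Hypothetical layout: 8 processes placed round-robin over 2 nodes.
cyclic = [r % 2 for r in range(8)]
new_rank = reorder_for_ring(cyclic)
reordered = apply_reordering(cyclic, new_rank)

print(inter_node_hops(cyclic), inter_node_hops(reordered))  # prints "8 2"
```

In an MPI program, one way such a permutation could be applied is to pass the new rank as the `key` argument of `MPI_Comm_split`, which orders ranks in the resulting communicator by key; whether the paper uses this particular call is not stated in the abstract.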


