2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum

Optimizing MPI Collectives Using Efficient Intra-node Communication Techniques over the Blue Gene/P Supercomputer

Abstract

The Blue Gene/P (BG/P) supercomputer consists of thousands of compute nodes interconnected by multiple networks. Of these, a 3D torus equipped with a direct memory access (DMA) engine is the primary network. BG/P also features a collective network that supports hardware-accelerated collective operations such as broadcast and allreduce. One of the operating modes on BG/P is virtual node mode, in which all four cores can be active MPI tasks, performing both inter-node and intra-node communication. This paper proposes software techniques to enhance the MPI collective communication primitives MPI_Bcast and MPI_Allreduce in virtual node mode by using the cache-coherent memory subsystem as the communication medium within the node. The paper describes techniques that leverage atomic operations to design concurrent data structures such as broadcast FIFOs to enable efficient collectives. Such mechanisms are important because core counts are expected to rise, and such data structures make programming easier and more efficient. We also demonstrate the utility of shared address space techniques for MPI collectives, wherein a process can access a peer's memory through specialized system calls. Apart from cutting down copy costs, such techniques allow seamless integration of network protocols with intra-node communication methods. We propose intra-node extensions to multi-color network algorithms for collectives using lightweight synchronizing structures and atomic operations. Further, we demonstrate that shared address techniques allow for good load balancing and are critical for efficiently using the hardware collective network on BG/P. Compared to current approaches on the 3D torus, our optimizations improve performance by up to almost 3x for MPI_Bcast and by 33% for MPI_Allreduce (in virtual node mode). We also see improvements of up to 44% for MPI_Bcast using the collective tree network.
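
The broadcast FIFO mentioned in the abstract is a single-producer, multiple-consumer structure held in node-shared, cache-coherent memory. The paper's actual layout is not reproduced here; what follows is a minimal C11 sketch under assumed conventions (names such as bcast_fifo_t, NUM_SLOTS, and SLOT_BYTES are illustrative, not from the paper), and slot reclamation, which would let the root safely wrap around past slow consumers, is omitted for brevity.

/* Hypothetical sketch of a broadcast FIFO built from atomic operations,
 * in the spirit of the structure described in the abstract. The structure
 * is assumed to live in a cache-coherent segment visible to all four
 * cores of a node. */
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define NUM_SLOTS  4
#define SLOT_BYTES 1024

typedef struct {
    _Atomic unsigned long published;    /* number of chunks published by the root */
    char slots[NUM_SLOTS][SLOT_BYTES];  /* staging buffers, one chunk per slot    */
} bcast_fifo_t;

/* Root core: stage chunk number 'chunk', then publish it. The release
 * increment guarantees the payload is visible before the counter moves. */
static void fifo_publish(bcast_fifo_t *f, unsigned long chunk,
                         const void *buf, size_t len)
{
    memcpy(f->slots[chunk % NUM_SLOTS], buf, len);
    atomic_fetch_add_explicit(&f->published, 1, memory_order_release);
}

/* Non-root core: spin until chunk number 'chunk' has been published,
 * then copy it out. The acquire load pairs with the root's release. */
static void fifo_consume(bcast_fifo_t *f, unsigned long chunk,
                         void *buf, size_t len)
{
    while (atomic_load_explicit(&f->published, memory_order_acquire) <= chunk)
        ;                               /* busy-wait; real code would back off */
    memcpy(buf, f->slots[chunk % NUM_SLOTS], len);
}

The release increment paired with the consumers' acquire loads is what makes a single counter sufficient: any core that observes the new count is guaranteed to also see the staged payload.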
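The shared-address-space technique described in the abstract lets one task read a peer's buffer directly instead of staging it through an intermediate shared segment, which is where the copy savings come from. BG/P's compute node kernel exposes its own specialized calls for this; the sketch below uses Linux's process_vm_readv purely as an analogous, generally available primitive, and assumes the peer's pid and buffer address have already been exchanged out of band (for example, over shared memory).

/* Hedged illustration of single-copy access to a peer process's memory.
 * This is not the BG/P system call interface; process_vm_readv is used
 * only to show the idea of one copy straight from the peer's buffer. */
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/uio.h>
#include <unistd.h>

/* Copy 'len' bytes starting at 'remote_addr' in process 'peer' into
 * 'local_buf'. Returns the number of bytes copied, or -1 with errno set. */
static ssize_t read_peer(pid_t peer, void *local_buf,
                         const void *remote_addr, size_t len)
{
    struct iovec local  = { .iov_base = local_buf,           .iov_len = len };
    struct iovec remote = { .iov_base = (void *)remote_addr, .iov_len = len };
    return process_vm_readv(peer, &local, 1, &remote, 1, 0);
}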
