Design and Characterization of Shared Address Space MPI Collectives on Modern Architectures

Abstract

Emerging multi-/many-core processors such as Intel Xeon and Xeon Phi are being widely adopted for modern large-scale supercomputing systems. The architectural features offered by these systems, such as high core density, mesh interconnects, deeper memory hierarchies, and hardware multi-threading, provide opportunities for application developers to exploit more parallelism. However, they also pose significant challenges for MPI runtimes in optimizing communication performance. One of the major challenges involves optimizing collective communication for dense multi-/many-core processors. Traditionally, MPI runtimes have used send/recv, direct shared-memory ("double-copy"), or kernel-assisted ("single-copy") mechanisms for intra-node collective communication. However, existing collective designs based on these mechanisms suffer from several bottlenecks, such as multiple copies, per-message handshakes, and kernel-level lock contention, that limit their performance. In this paper, we first characterize the bottlenecks associated with the aforementioned approaches to designing collectives in MPI. Then, we propose efficient "shared-address-space"-based designs to implement different MPI collectives. Finally, we show the efficacy of our approach by implementing various MPI collectives. Our proposed designs show up to 11x, 50x, 17x, and 5x performance improvement for Bcast, Scatter, Gather, and Alltoall, respectively, over other state-of-the-art MPI libraries on different multi-/many-core architectures.

Bibliographic record

Source
IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2019 | pp. 410-419 | 10 pages
Conference location
Authors
Jahanzeb Maqbool Hashmi; Sourav Chakraborty; Mohammadreza Bayatpour; Hari Subramoni; Dhabaleswar K. Panda;
Author affiliations
Conference organizer
Original format: PDF
Language
CLC classification
Keywords
application program interfaces; message passing; microprocessor chips; multiprocessing systems; multi-threading; parallel machines; shared memory systems;

Similar literature

1. Improved MPI collectives for MPI processes in shared address spaces [J]. Li Shigang, Hoefler Torsten, Hu Chungjin. Cluster Computing, 2014, No. 4
2. A Comparison of MPI, SHMEM and Cache-Coherent Shared Address Space Programming Models on a Tightly-Coupled Multiprocessors [J]. Hongzhang Shan, Jaswinder Pal Singh. International Journal of Parallel Programming, 2001, No. 3
3. Parallelizing a GIS on a shared address space architecture [J]. Shekhar S., Ravada S. Computer, 1996, No. 12
4. Design and Characterization of Shared Address Space MPI Collectives on Modern Architectures [C]. Jahanzeb Maqbool Hashmi, Sourav Chakraborty, Mohammadreza Bayatpour. IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2019
5. High-Performance Communication in MPI through Message Matching and Neighborhood Collective Design [D]. Ghazimirsaeed, Seyedeh Mahdieh. 2019
6. Exploration of a Capability-Focused Aerospace System of Systems Architecture Alternative with Bilayer Design Space Based on RST-SOM Algorithmic Methods [O]. Zhifei Li, Dongliang Qin, Feng Yang
7. A Comparison of MPI, SHMEM and Cache-coherent Shared Address Space Programming Models on the SGI Origin2000 [O]. Hongzhang Shan, Jaswinder Pal Singh. 1999
8. MPI Support for Multi-Core Architectures: Optimized Shared Memory Collectives [R]. Graham, R. L., Shipman, G. 2013
