
MPI+ULT: Overlapping Communication and Computation with User-Level Threads


Abstract

As the core density of future processors keeps increasing, MPI+Threads is becoming a promising programming model for large-scale SMP clusters. Generally speaking, a hybrid MPI+Threads runtime can greatly improve intra-node parallelism and data sharing on shared-memory architectures. However, it does little for inter-node communication, because existing communication and threading libraries are integrated inefficiently. More specifically, existing MPI+Threads runtime systems use coarse-grained locks to ensure thread safety, which leads to heavy lock contention and limits the scalability of the runtime. While kernel threads are efficient for intra-node parallelism, we found that they are too heavyweight for computation/communication overlap in an MPI+Threads runtime system. In this paper we propose a new approach to asynchronous MPI communication with user-level threads (MPI+ULT). By enabling ULT context switching inside MPI, MPI communication in one ULT can overlap with computation or communication in other ULTs. MPI+ULT can be used to hide communication in various scenarios, including MPI point-to-point, collective, and one-sided calls. We use MPI+ULT in two applications, a high-performance conjugate gradient benchmark and a genome assembly application, to show how MPI+ULT can effectively hide communication and reduce runtime overhead. Experiments show that our method significantly improves the performance of these applications.
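The overlap idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation and makes no real MPI calls: it assumes Python generators as stand-ins for ULTs, a round-robin loop as a stand-in for the ULT scheduler, and a fixed poll count as a stand-in for a pending MPI request. The names `run_ults`, `comm_ult`, and `compute_ult` are hypothetical and exist only for this illustration.

```python
from collections import deque


def comm_ult(polls_needed):
    """Stand-in for a ULT blocked inside a communication call.

    Each time the (simulated) request is still incomplete, the ULT yields,
    which models a context switch to another ULT instead of busy-waiting.
    """
    for _ in range(polls_needed):
        yield "comm pending, yield"
    yield "comm complete"


def compute_ult(steps):
    """Stand-in for a ULT doing computation in small cooperative steps."""
    for i in range(steps):
        yield f"compute step {i}"


def run_ults(ults):
    """Round-robin scheduler: resume each ULT until it yields or finishes.

    Returns the ordered trace of (ULT name, event) pairs.
    """
    ready = deque(ults)
    trace = []
    while ready:
        name, ult = ready.popleft()
        try:
            event = next(ult)      # resume the ULT until its next yield
        except StopIteration:
            continue               # ULT finished; drop it from the queue
        trace.append((name, event))
        ready.append((name, ult))  # still runnable: requeue at the back
    return trace


if __name__ == "__main__":
    for name, event in run_ults([("comm", comm_ult(2)),
                                 ("work", compute_ult(3))]):
        print(f"{name}: {event}")
```

In the resulting trace, the compute steps of one ULT are interleaved with the pending polls of the other, which is the kind of overlap the abstract describes. A real runtime would switch contexts inside the MPI library rather than in application code.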
