High Performance MPI on IBM 12x InfiniBand Architecture


Abstract

InfiniBand is becoming increasingly popular in cluster computing due to its open standard and high performance. I/O interfaces such as PCI-Express and GX+ are being introduced as next-generation technologies to drive InfiniBand at very high throughput. HCAs with 8x throughput on PCI-Express have become available, and support for HCAs with 12x throughput on GX+ has recently been announced. In this paper, we design a message passing interface (MPI) on IBM 12x dual-port HCAs, which consist of multiple send/recv engines per port. We propose and study the impact of various communication scheduling policies (binding, striping, and round robin). Based on this study, we present a new policy, EPC (enhanced point-to-point and collective), which accounts for different kinds of communication patterns for data transfer: point-to-point (blocking and non-blocking) and collective communication. We implement our design and evaluate it with micro-benchmarks, collective communication, and the NAS Parallel Benchmarks. Using EPC on a 12x InfiniBand cluster with one HCA and one port, we improve performance by 41% on the ping-pong latency test and by 63-65% on the unidirectional and bidirectional bandwidth tests, compared with the default single-rail MPI implementation. Our evaluation on the NAS Parallel Benchmarks shows an improvement of 7-13% in execution time for integer sort and Fourier transform.
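The three baseline scheduling policies named in the abstract (binding, striping, and round robin) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function and class names, the `(engine, offset, length)` plan format, and the equal-chunk split are all assumptions made for illustration, and EPC itself is not modeled because the abstract does not specify how it chooses among policies.

```python
# Illustrative sketch only: names and the plan format are hypothetical,
# not taken from the paper. A "plan" is a list of (engine, offset, length)
# tuples saying which send/recv engine carries which bytes of a message.

def binding(msg_size, bound_engine=0):
    """Binding: the whole message always goes out on one fixed engine."""
    return [(bound_engine, 0, msg_size)]


class RoundRobin:
    """Round robin: each whole message goes to the next engine in turn."""

    def __init__(self, num_engines):
        self.num_engines = num_engines
        self._next = 0

    def schedule(self, msg_size):
        engine = self._next
        self._next = (self._next + 1) % self.num_engines
        return [(engine, 0, msg_size)]


def striping(msg_size, num_engines):
    """Striping: split one message into near-equal contiguous chunks,
    one chunk per engine (engines with nothing to send are skipped)."""
    base, extra = divmod(msg_size, num_engines)
    plan, offset = [], 0
    for engine in range(num_engines):
        length = base + (1 if engine < extra else 0)
        if length:
            plan.append((engine, offset, length))
            offset += length
    return plan
```

Under these assumptions, striping spreads a single large message across all engines at once, while binding and round robin keep each message whole and differ only in which engine is picked.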

Bibliographic record

Source: 2007, pp. 1-8 (8 pages)
Authors: Abhinav Vishnu; B. Benton; D.K. Panda
Original format: PDF
Keywords: Fourier transforms; application program interfaces; computer architecture; message passing; peripheral interfaces; Fourier transform; HCA; IBM 12x InfiniBand architecture; PCI-express; application program interface; cluster computing; communication scheduling policy

Similar literature

1. MPI applications' performances in native vs. virtualized environments using InfiniBand IPoIB virtualization and live migration [J]. Kobal Marko, Car Zlatan, Ojsteršek Milan. Technical Gazette, 2015, No. 6.
2. Performance Benchmark and MPI Evaluation Using Westmere-Based InfiniBand HPC Cluster [J]. Basem Madani, Raed Al-Shaikh. International Journal of Simulation: Systems, Science and Technology, 2011, No. 1.
3. High performance RDMA-based MPI implementation over InfiniBand [J]. Liu JX, Wu JS, Panda DK. International Journal of Parallel Programming, 2004, No. 3.
4. High Performance MPI on IBM 12x InfiniBand Architecture [C]. Abhinav Vishnu, Benton B., Panda D.K. IEEE International Parallel and Distributed Processing Symposium, 2007.
5. Scalable and high-performance MPI design for very large InfiniBand clusters [D]. Sur, Sayantan. 2007.
6. High Performance Data Clustering: A Comparative Analysis of Performance for GPU RASC MPI and OpenMP Implementations [O]. Luobin Yang, Steve C. Chiu, Wei-Keng Liao.
7. High performance checksum computation for fault-tolerant MPI over InfiniBand [O]. Re Denis, Francois Trahay, Yutaka Ishikawa. 2015.
8. Performance of an MPI-Only Semiconductor Device Simulator on a Quad Socket/Quad Core InfiniBand Platform [R]. Lin, P. T., Shadid, J. N. 2009.
