
High performance and scalable MPI intra-node communication middleware for multi-core clusters.


Abstract

Clusters of workstations are among the most popular architectures in high performance computing, thanks to their cost-to-performance effectiveness. As multi-core technologies become mainstream, more and more clusters deploy multi-core processors as the building unit. In the latest Top500 supercomputer list, published in November 2008, about 85% of the sites use multi-core processors from Intel and AMD. The Message Passing Interface (MPI) is one of the most popular programming models for cluster computing. With the increased deployment of multi-core systems in clusters, considerable communication is expected to take place within a node. This suggests that MPI intra-node communication will play a key role in overall application performance.

This dissertation presents novel MPI intra-node communication designs, including a user-level shared-memory-based approach, a kernel-assisted direct copy approach, and an efficient multi-core-aware hybrid approach. The user-level shared-memory-based approach is portable across operating systems and platforms. Processes copy messages into and out of a shared memory area to communicate. The shared buffers are organized so as to be efficient in both cache utilization and memory usage. The kernel-assisted direct copy approach takes help from the operating system kernel and copies a message directly from one process to another, so it needs only one copy and improves performance over the shared-memory-based approach. In this approach, the memory copy can be either CPU based or DMA based. This dissertation explores both directions; for DMA-based memory copy, we take advantage of novel mechanisms such as Intel I/OAT to achieve better performance and computation-communication overlap. To optimize performance on multi-core systems, we efficiently combine the shared memory approach and the kernel-assisted direct copy approach and propose a topology-aware and skew-aware hybrid approach.
The dissertation also presents a comprehensive performance evaluation and analysis of these approaches on contemporary multi-core systems such as an Intel Clovertown cluster and an AMD Barcelona cluster, both of which are quad-core-processor-based systems.

Software developed as part of this dissertation is available in MVAPICH and MVAPICH2, which are popular open-source implementations of the MPI-1 and MPI-2 libraries over InfiniBand and other RDMA-enabled networks and are used by several hundred top computing sites around the world.

Bibliographic details

  • Author

    Chai, Lei.

  • Author affiliation

    The Ohio State University.

  • Degree grantor: The Ohio State University.
  • Subject: Computer Science.
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 155 p.
  • Total pages: 155
  • Format: PDF
  • Language: eng
  • CLC classification: Automation and computer technology
  • Keywords

