Improving message-passing performance and scalability in high-performance clusters.

Abstract

High Performance Computing (HPC) is the key to solving many scientific, financial, and engineering problems. Computer clusters are now the dominant architecture for HPC. The scale of clusters, both in terms of processors per node and the number of nodes, is increasing rapidly, with systems now reaching petascale and soon approaching exascale. Inter-process communication plays a significant role in the overall performance of HPC applications. With the continuous enhancements in interconnection technologies and node architectures, the Message Passing Interface (MPI) needs to be improved to effectively utilize these modern technologies for higher performance.

To improve communication progress and overlap in large-message transfers, a method is proposed that uses speculative communication to overlap communication with computation in the MPI Rendezvous protocol. The results show up to 100% communication progress and more than 80% overlap ability over iWARP Ethernet. An adaptation mechanism is employed to avoid overhead on applications that, because of their timing characteristics, do not benefit from the method.

To reduce MPI communication latency, I have proposed a technique that exploits the application's buffer-reuse characteristics for small messages and eliminates the sender-side copy in both two-sided and one-sided MPI small-message transfer protocols. The implementation over InfiniBand improves small-message latency by up to 20%, and it adaptively falls back to the current method if the application does not benefit from the proposed technique.

Finally, to improve the scalability of MPI applications on ultra-scale clusters, I have proposed an extension to the current iWARP standard. The extension equips Ethernet with an efficient zero-copy, connection-less datagram transport, improving performance and memory usage on large-scale clusters. The software-level evaluation shows more than 40% performance improvement and a 30% reduction in memory usage for MPI applications on a 64-core cluster.

After providing a background, I present a deep analysis of the user-level and MPI libraries over modern cluster interconnects: InfiniBand, iWARP Ethernet, and Myrinet. Using novel techniques, I assess characteristics such as overlap and communication progress ability, the effect of buffer reuse on latency, and multiple-connection scalability. The outcome highlights some of the inefficiencies that exist in these communication libraries.
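For illustration, the following is a minimal sketch, not taken from the dissertation, of the kind of microbenchmark typically used to assess communication progress and computation/communication overlap for large (Rendezvous-range) messages: a non-blocking transfer is posted, independent computation runs while the message should be progressing in the background, and the total time indicates how much of the transfer was hidden. The message size and the busy_compute helper are illustrative assumptions.

    /* overlap_sketch.c -- illustrative shape of a communication/computation
     * overlap microbenchmark for large (Rendezvous-range) messages.
     * Build: mpicc overlap_sketch.c -o overlap_sketch
     * Run:   mpirun -np 2 ./overlap_sketch                                  */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MSG_SIZE (1 << 20)   /* 1 MiB: large enough for the Rendezvous path in typical MPI stacks */

    /* Independent computation that never touches the message buffer. */
    static double busy_compute(long n)
    {
        double s = 0.0;
        for (long i = 1; i <= n; i++)
            s += 1.0 / (double)i;
        return s;
    }

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char *buf = malloc(MSG_SIZE);
        memset(buf, 0, MSG_SIZE);

        MPI_Request req;
        double sink = 0.0;

        double t0 = MPI_Wtime();
        if (rank == 0) {
            /* Post the large send early, compute, then wait: with good
             * progress/overlap the transfer completes in the background
             * while busy_compute() runs. */
            MPI_Isend(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
            sink += busy_compute(10000000L);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Irecv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);
            sink += busy_compute(10000000L);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
        double total = MPI_Wtime() - t0;

        if (rank == 0)
            printf("transfer + inserted computation: %.6f s (sink=%g)\n",
                   total, sink);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Comparing this elapsed time against the transfer time and computation time measured separately indicates how much of the communication an MPI library hides behind the computation, which is the quantity the overlap and progress results above refer to.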
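The small-message technique described above builds on a common application pattern: the same application buffer is posted repeatedly for messages of the same small size. The ping-pong loop below is a hypothetical sketch rather than the dissertation's implementation; it shows that reuse pattern, which an MPI library could detect in order to transmit directly from the (already registered) application buffer instead of first copying the payload into an internal eager buffer on the sender side. Buffer size and iteration count are assumptions.

    /* reuse_pingpong.c -- illustrative small-message ping-pong in which the
     * same application buffer is reused on every iteration (the pattern the
     * proposed technique exploits to skip the sender-side copy).
     * Build: mpicc reuse_pingpong.c -o reuse_pingpong
     * Run:   mpirun -np 2 ./reuse_pingpong                                  */
    #include <mpi.h>
    #include <stdio.h>

    #define MSG_SIZE 64      /* well inside the eager/small-message range */
    #define ITERS    10000

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        char buf[MSG_SIZE] = {0};   /* one buffer, reused for every send/recv */

        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double elapsed = MPI_Wtime() - t0;

        if (rank == 0)
            printf("average one-way latency: %.3f us\n",
                   elapsed / (2.0 * ITERS) * 1e6);

        MPI_Finalize();
        return 0;
    }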

Bibliographic information

  • Author

    Rashti, Mohammad Javad.

  • Author affiliation

    Queen's University (Canada).

  • Degree-granting institution: Queen's University (Canada).
  • Subject: Engineering, Electronics and Electrical.
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 184 p.
  • Total pages: 184
  • Original format: PDF
  • Language: eng
  • CLC classification:
  • Keywords:
