Journal: International Journal of High Performance Computing and Networking

Optimisation and performance evaluation of mechanisms for latency tolerance in remote memory access communication on clusters


Abstract

This paper describes the design and performance evaluation of mechanisms for latency tolerance in remote memory access communication on clusters equipped with high-performance networks such as Myrinet. It discusses strategies that bridge the gap between user-level requirements and network-specific communication interfaces while attempting to increase opportunities for latency hiding. Two mechanisms are explored: overlapping communication with computation, and coalescing small messages (trading latency for bandwidth). The effectiveness of these techniques is evaluated using microbenchmarks and application kernels, including the NAS parallel benchmark suite. The microbenchmark results showed a much better degree of overlap for non-blocking operations in ARMCI than in MPI. Application results showed improvements of up to 30-45% over MPI when using non-blocking operations. Aggregating small messages yielded a performance improvement of up to 78% over non-aggregated communication.
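The intuition behind both techniques can be illustrated with a simple latency-bandwidth cost model. The sketch below is not taken from the paper and does not call ARMCI or MPI. All function names and parameter values are hypothetical, chosen only to show why packing many small messages into one pays the startup latency once, and why ideal overlap hides the shorter of communication and computation.

```python
# Illustrative cost model only. The functions and the numbers in the demo
# are assumptions for this sketch, not measurements or APIs from the paper.

def transfer_time(nbytes, latency_s, bandwidth_bytes_per_s):
    """Time for one message: fixed startup latency plus size / bandwidth."""
    return latency_s + nbytes / bandwidth_bytes_per_s


def separate_sends(count, nbytes, latency_s, bandwidth_bytes_per_s):
    """Each small message is sent on its own and pays the latency itself."""
    return count * transfer_time(nbytes, latency_s, bandwidth_bytes_per_s)


def aggregated_send(count, nbytes, latency_s, bandwidth_bytes_per_s):
    """All payloads are coalesced into one message; latency is paid once."""
    return transfer_time(count * nbytes, latency_s, bandwidth_bytes_per_s)


def blocking_step(comm_s, compute_s):
    """Blocking communication: computation starts after the transfer ends."""
    return comm_s + compute_s


def overlapped_step(comm_s, compute_s):
    """Ideal non-blocking overlap: the step costs the longer of the two."""
    return max(comm_s, compute_s)


if __name__ == "__main__":
    # Hypothetical network: 10 microseconds latency, 250 MB/s bandwidth.
    latency, bandwidth = 10e-6, 250e6
    count, size = 100, 64  # one hundred 64-byte messages

    t_sep = separate_sends(count, size, latency, bandwidth)
    t_agg = aggregated_send(count, size, latency, bandwidth)
    print(f"separate:   {t_sep:.3e} s")
    print(f"aggregated: {t_agg:.3e} s")
    print(f"blocking step:   {blocking_step(t_agg, 1e-4):.3e} s")
    print(f"overlapped step: {overlapped_step(t_agg, 1e-4):.3e} s")
```

In this model, aggregation saves exactly `(count - 1)` startup latencies, so the benefit is largest when messages are small and latency dominates. Real networks add effects the model ignores, such as packing cost and limits on how much overlap the hardware can deliver.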
