Journal: Parallel Computing

Planning for performance: Enhancing achievable performance for MPI through persistent collective operations

Abstract

Advantages of nonblocking collective communication in MPI have been established over the past quarter century, even predating MPI-1. For regular computations with fixed communication patterns, significant additional optimizations can be exposed through the use of persistence (planned transfers), which is not currently available in the MPI-3 API except for a limited form of point-to-point persistence (aka half-channels) standardized since MPI-1. This paper covers the design and prototype implementation of LibPNBC (based on LibNBC), and the MPI-4 standardization status of persistent nonblocking collective operations. We provide early performance results, using a modified version of NBCBench and an example application (based on 3D conjugate gradient), illustrating the potential performance enhancements for such operations. Persistent operations enable MPI implementations to make intelligent choices about algorithm and resource utilization once and amortize this decision cost across many uses in a long-running program. We provide evidence that this approach is of value. As with non-persistent, nonblocking collective operations, strong progress and blocking completion notification are jointly needed to maximize the benefit of such operations (e.g., to support overlap of communication with computation and/or other communication). Further enhancement of the current reference implementation, as well as additional opportunities to enhance performance through the application of these new APIs, constitute future work. © 2018 Published by Elsevier B.V.
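
The "decide once, amortize across many uses" idea in the abstract can be illustrated with a toy, in-process model. The sketch below is not MPI code and is not taken from LibPNBC: the `Planner`, `PersistentAllreduce`, and `allreduce_once` names, the two summation routines, and the rank-count threshold are all hypothetical and chosen only for illustration. It mimics the init / start / wait lifecycle that MPI-4 exposes through calls such as `MPI_Allreduce_init`, `MPI_Start`, and `MPI_Wait`, with the buffer bound once at initialization and refilled between uses.

```python
from typing import Callable, List, Optional, Sequence


def linear_sum(values: Sequence[int]) -> int:
    """Reduce by accumulating contributions one after another."""
    total = 0
    for v in values:
        total += v
    return total


def tree_sum(values: Sequence[int]) -> int:
    """Reduce by pairwise combination, as a reduction tree would."""
    vals = list(values)
    while len(vals) > 1:
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # odd element carried to the next level
        vals = nxt
    return vals[0]


class Planner:
    """Hypothetical stand-in for a library's algorithm-selection step."""

    def __init__(self) -> None:
        self.calls = 0  # how many times a planning decision was made

    def choose(self, num_ranks: int) -> Callable[[Sequence[int]], int]:
        self.calls += 1
        # Arbitrary illustrative threshold, not a real tuning rule.
        return tree_sum if num_ranks > 4 else linear_sum


def allreduce_once(values: Sequence[int], planner: Planner) -> int:
    """Non-persistent model: the planning decision is repeated on every call."""
    return planner.choose(len(values))(values)


class PersistentAllreduce:
    """Persistent model: plan at construction, then start/wait repeatedly."""

    def __init__(self, sendbuf: List[int], planner: Planner) -> None:
        if not sendbuf:
            raise ValueError("need at least one simulated rank")
        self._buf = sendbuf  # buffer is bound once; its contents may change
        self._algo = planner.choose(len(sendbuf))  # planned exactly once
        self._result: Optional[int] = None

    def start(self) -> None:
        self._result = self._algo(self._buf)

    def wait(self) -> int:
        if self._result is None:
            raise RuntimeError("wait() called without a matching start()")
        result, self._result = self._result, None
        return result


if __name__ == "__main__":
    planner = Planner()
    buf = [0] * 6  # one contribution per simulated rank
    op = PersistentAllreduce(buf, planner)
    results = []
    for step in range(1, 4):
        buf[:] = [rank * step for rank in range(1, 7)]  # refill in place
        op.start()
        results.append(op.wait())
    print(results, planner.calls)  # prints: [21, 42, 63] 1
```

In this model the persistent object makes one planning decision for three reductions, whereas three calls to `allreduce_once` would make three. A real implementation could additionally set up resources at initialization time, which the sketch does not attempt to model.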