Journal: IEEE Transactions on Parallel and Distributed Systems

Bridging the Gap Between OpenMP and Task-Based Runtime Systems for the Fast Multipole Method

Abstract

With the advent of complex modern architectures, the low-level paradigms long considered sufficient to build High Performance Computing (HPC) numerical codes have met their limits. Achieving efficiency and ensuring portability while preserving programming tractability on such hardware has prompted the HPC community to design new, higher-level paradigms while relying on runtime systems to maintain performance. However, a common weakness of these projects is that they deeply tie applications to specific expert-only runtime system APIs. The OpenMP specification, which aims to provide common parallel programming means for shared-memory platforms, appears to be a good candidate to address this issue thanks to the task-based constructs introduced in its revision 4.0. The goal of this paper is to assess the effectiveness and limits of this support for designing a high-performance numerical library, ScalFMM, which implements the fast multipole method (FMM) and which we have deeply redesigned to use the most advanced features provided by OpenMP 4. We show that OpenMP 4 allows for significant performance improvements over previous OpenMP revisions on recent multicore processors, and that extensions to the 4.0 standard strongly improve performance further, bridging the gap with the very high performance that was so far reserved for expert-only runtime system APIs.
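To make the "task-based constructs" of OpenMP 4.0 concrete, here is a minimal sketch in C of two tasks ordered by the `depend` clause, which that revision introduced. The functions `upward_pass`, `transfer_pass`, and `run_pipeline` are hypothetical stand-ins for two dependent stages of a computation; they are not ScalFMM's API and do not implement the FMM itself.

```c
#include <stddef.h>

/* Hypothetical stage 1: reduce particle values into one "multipole" value. */
static void upward_pass(const double *particles, size_t n, double *multipole)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += particles[i];
    *multipole = sum;
}

/* Hypothetical stage 2: derive a "local" value from the multipole value. */
static void transfer_pass(const double *multipole, double *local)
{
    *local = 2.0 * (*multipole);
}

/* Runs both stages as OpenMP tasks. The depend clauses tell the runtime
 * that the second task reads what the first one writes, so the runtime
 * orders them without an explicit barrier between the two tasks. */
double run_pipeline(const double *particles, size_t n)
{
    double multipole = 0.0;
    double local = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: multipole) shared(multipole)
        upward_pass(particles, n, &multipole);

        #pragma omp task depend(in: multipole) depend(out: local) shared(multipole, local)
        transfer_pass(&multipole, &local);

        /* Wait for both child tasks before leaving the single region. */
        #pragma omp taskwait
    }
    return local;
}
```

Calling `run_pipeline` on the four values 1, 2, 3, 4 returns 20. If the file is compiled without OpenMP support, the pragmas are ignored and the two stages simply run in program order, which gives the same result.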