Journal: Concurrency, practice and experience

An evaluation of MPI and OpenMP paradigms in finite-difference explicit methods for PDEs on shared-memory multi- and manycore systems

Abstract

This paper focuses on parallel implementations of three two-dimensional explicit numerical methods on an Intel(R) Xeon(R) Scalable Processor and the Knights Landing coprocessor. In this study, the performance of a hybrid parallel programming approach that combines the message passing interface (MPI) with Open Multi-Processing (OpenMP), and of a pure MPI implementation used with two thread binding policies, is compared with that of an improved OpenMP-based implementation in three explicit finite-difference methods for solving partial differential equations on shared-memory multicore and manycore systems. Specifically, the improved OpenMP-based version synchronizes adjacent threads and eliminates the implicit barriers of a naive OpenMP-based implementation. The experiments show that the most suitable approach depends on several characteristics related to the nonuniform memory access (NUMA) effect and load balancing, such as the size of the MPI domain and the number of synchronization points used in the parallel implementation. In algorithms that use four and five synchronization points, the hybrid MPI/OpenMP approaches yielded better speedups than the other versions in runs performed on both systems. The pure MPI-based strategy, however, achieved better results than the other approaches in the method that employs only one synchronization point.
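
The idea of synchronizing adjacent threads instead of using a global barrier can be illustrated with a small sketch. The code below is not the authors' implementation (which uses OpenMP and is not shown in the abstract); it is a minimal Python illustration under stated assumptions: a 2D explicit five-point update with fixed boundary values, a row-wise split of the grid into strips, and a hypothetical progress counter `done` that each thread checks only for its two neighbors. The names `step_rows`, `solve_serial`, and `solve_neighbor_sync` are invented for this example. Python threads give no speedup here because of the global interpreter lock; the sketch only shows the synchronization pattern.

```python
import threading


def step_rows(old, new, r0, r1, alpha):
    """Apply one explicit 5-point update to rows r0..r1-1 (interior columns only)."""
    ncols = len(old[0])
    for i in range(r0, r1):
        for j in range(1, ncols - 1):
            new[i][j] = old[i][j] + alpha * (
                old[i - 1][j] + old[i + 1][j]
                + old[i][j - 1] + old[i][j + 1]
                - 4.0 * old[i][j]
            )


def solve_serial(u0, steps, alpha):
    """Reference solver: one thread, two buffers swapped every time step."""
    n = len(u0)
    bufs = [[row[:] for row in u0], [row[:] for row in u0]]
    for t in range(steps):
        step_rows(bufs[t % 2], bufs[(t + 1) % 2], 1, n - 1, alpha)
    return bufs[steps % 2]


def solve_neighbor_sync(u0, steps, alpha, nthreads):
    """Each thread owns a strip of rows and waits only for its adjacent threads."""
    n = len(u0)
    interior = n - 2
    if not 1 <= nthreads <= interior:
        # Assumption of this sketch: every thread owns at least one row.
        raise ValueError("nthreads must be between 1 and the number of interior rows")
    bufs = [[row[:] for row in u0], [row[:] for row in u0]]
    # Contiguous strips covering interior rows 1..n-2.
    bounds = [1 + (interior * k) // nthreads for k in range(nthreads + 1)]
    done = [0] * nthreads          # done[k] = time steps completed by thread k
    cond = threading.Condition()

    def worker(k):
        r0, r1 = bounds[k], bounds[k + 1]
        neighbors = [j for j in (k - 1, k + 1) if 0 <= j < nthreads]
        for t in range(steps):
            # Wait until both neighbors have finished step t-1: their halo rows
            # at level t are then written, and they no longer read the rows of
            # the buffer this thread is about to overwrite.
            with cond:
                cond.wait_for(lambda: all(done[j] >= t for j in neighbors))
            step_rows(bufs[t % 2], bufs[(t + 1) % 2], r0, r1, alpha)
            with cond:
                done[k] = t + 1
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(nthreads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return bufs[steps % 2]
```

With this scheme, a thread may run at most one time step ahead of its neighbors, and threads that are far apart in the grid never wait on each other directly. A global barrier after every step would instead force all threads to wait for the slowest one.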