Comparing Runtime Systems with Exascale Ambitions Using the Parallel Research Kernels


Abstract

We use three Parallel Research Kernels (PRK) to compare the performance of a set of programming models: MPI1 (MPI two-sided communication), MPIOPENMP (MPI+OpenMP), MPISHM (MPI1 with MPI-3 interprocess shared memory), MPIRMA (MPI one-sided communication), SHMEM, UPC, Charm++ and Grappa. (We employ the term programming model as it is commonly used in the application community. A more accurate term is programming environment, which is the collective of the abstract programming model, the embodiment of the model in an Application Programmer Interface (API), and the runtime that implements it.) The kernels in our study (Stencil, Synch_p2p and Transpose) underlie a wide range of computational science applications. They enable direct probing of properties of programming models, especially communication and synchronization. In contrast to mini- or proxy applications, the PRK allow for rapid implementation, measurement and verification. Our experimental results show MPISHM to be the overall winner, with MPI1, MPIOPENMP and SHMEM also performing well. MPISHM and MPIOPENMP outperform the other models in the strong-scaling limit due to their effective use of shared memory and good granularity control. The non-evolutionary models Grappa and Charm++ are not competitive with traditional models (MPI and PGAS) for two of the kernels; these models favor irregular algorithms, while the PRK considered here are regular.

Bibliographic record

Source
International conference on high performance computing | 2016 | pp. 321-339 | 19 pages
Authors
Rob F. Van der Wijngaart; Abdullah Kayi; Jeff R. Hammond; Gabriele Jost; Tom St. John; Srinivas Sridharan; Timothy G. Mattson; John Abercrombie; Jacob Nelson
Original format
PDF
Keywords
Programming models; MPI; PGAS; Charm++

