Journal: International Journal of Parallel Programming

Source-to-Source Parallelization Compilers for Scientific Shared-Memory Multi-core and Accelerated Multiprocessing: Analysis, Pitfalls, Enhancement and Potential

Abstract

Parallelization schemes are essential for exploiting the full benefits of multi-core architectures, which have become widespread in recent years, especially for scientific applications. In shared-memory architectures, the most common parallelization API is OpenMP. However, introducing correct and optimal OpenMP parallelization into applications is not always simple, owing to common pitfalls in parallel shared-memory management and to architecture heterogeneity. To ease this process, many automatic parallelization compilers have been created. In this paper we focus on three source-to-source compilers (AutoPar, Par4All and Cetus) that were found to be the most suitable for the task; we point out their strengths and weaknesses, analyze their performance, inspect their capabilities and suggest new paths for enhancement. We analyze and compare the compilers' performance over several exemplary test cases, each exposing a different pitfall, and suggest several new ways to overcome these pitfalls that yield excellent results in practice. Moreover, we note that all of these source-to-source parallelization compilers operate within the limits of OpenMP 2.5, an outdated version of the API that is no longer well suited to today's complex heterogeneous architectures. We therefore suggest a path to exploit the new features of OpenMP 4.5, which provides new directives to fully utilize heterogeneous architectures, specifically those with strong collaboration between CPUs and GPGPUs, and thus outperforms previous results by an order of magnitude.
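To make the contrast between the two API generations concrete, the following is a minimal sketch, not taken from the paper. It assumes a simple SAXPY-style loop as the workload; the function names `saxpy_cpu` and `saxpy_offload` are hypothetical. The first function uses a work-sharing directive of the kind available in OpenMP 2.5, while the second uses the `target` family of directives available in OpenMP 4.5 to request offloading to an accelerator.

```c
#include <stddef.h>

/* Hypothetical example: OpenMP 2.5-style work sharing across CPU threads.
 * A signed loop variable is used because OpenMP 2.5 requires one. */
void saxpy_cpu(size_t n, float a, const float *x, float *y)
{
    long i;
    #pragma omp parallel for
    for (i = 0; i < (long)n; i++)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical example: OpenMP 4.5-style offloading. The map clauses
 * state which array sections are copied to and from the device.
 * If no device is available, the region falls back to the host. */
void saxpy_offload(size_t n, float a, const float *x, float *y)
{
    long i;
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (i = 0; i < (long)n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both functions compute the same result. When compiled without OpenMP support the pragmas are ignored and the loops run serially, so the sketch can be checked on any C compiler. Whether offloading actually pays off depends on the data-transfer cost and the target hardware.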

Bibliographic details

  • Author affiliations

    Bar-Ilan University, Department of Physics, IL-52900 Ramat Gan, Israel; Israel Atomic Energy Commission, POB 7061, IL-61070 Tel Aviv, Israel

    Ben-Gurion University of the Negev, Department of Computer Science, POB 653, Beer Sheva, Israel; Nuclear Research Center Negev, Department of Physics, POB 9001, Beer Sheva, Israel

    Nuclear Research Center Negev, Department of Physics, POB 9001, Beer Sheva, Israel; Open University of Israel, Department of Mathematics and Computer Science, POB 808, Raanana, Israel

    Israel Atomic Energy Commission, POB 7061, IL-61070 Tel Aviv, Israel; Bar-Ilan University, Department of Computer Science, IL-52900 Ramat Gan, Israel

    Israel Atomic Energy Commission, POB 7061, IL-61070 Tel Aviv, Israel; Ben-Gurion University of the Negev, Department of Computer Science, POB 653, Beer Sheva, Israel

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Keywords

    Parallel programming; Automatic parallelism; Cetus; AutoPar; Par4All;
