Journal: Parallel Computing

Towards a more efficient implementation of OpenMP for clusters via translation to global arrays

Abstract

This paper discusses a novel approach to implementing OpenMP on clusters. Traditional approaches rely on Software Distributed Shared Memory systems to handle shared data. We discuss these and then introduce an alternative approach that translates OpenMP to Global Arrays (GA), explaining the basic strategy. GA requires a data distribution. We do not expect the user to supply this; rather, we show how we perform data distribution and work distribution according to the user-supplied OpenMP static loop schedules. An inspector-executor strategy is employed for irregular applications in order to gather information on accesses to potentially non-local data, group non-local data transfers, and overlap communication with local computation. Furthermore, a new directive, INVARIANT, is proposed to provide information about the dynamic scope of data access patterns. This directive can help us generate efficient code for irregular applications using the inspector-executor approach. We also illustrate how to deal with some hard cases containing reshaping and strided accesses during the translation. Our experiments show promising results for the corresponding regular and irregular GA codes.
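The two ideas named in the abstract, deriving a data distribution from a static loop schedule and splitting an irregular loop into an inspector and an executor, can be illustrated with a minimal C sketch. It is not the paper's translation scheme and does not use the Global Arrays API: everything runs in one address space, the grouped transfer is simulated by a plain copy, and all names (`block_range`, `static_block`, `inspect`, `execute`) are hypothetical. The block partition shown is one common way a static schedule without a chunk size divides iterations; the exact split is implementation defined.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open index range [lo, hi) owned by one process (hypothetical type). */
typedef struct { size_t lo, hi; } block_range;

/* Block partition of n iterations over nprocs processes: contiguous chunks
   whose sizes differ by at most one. Assumption: the array is distributed
   the same way as the loop iterations that access it. */
block_range static_block(size_t n, size_t nprocs, size_t rank) {
    assert(nprocs > 0 && rank < nprocs);
    size_t base = n / nprocs, extra = n % nprocs;
    block_range r;
    r.lo = rank * base + (rank < extra ? rank : extra);
    r.hi = r.lo + base + (rank < extra ? 1 : 0);
    return r;
}

/* Inspector: walk this process's iterations and record, in iteration order,
   every index reached through idx[] that lies outside the owned block.
   nonlocal must have room for (mine.hi - mine.lo) entries. */
size_t inspect(const size_t *idx, block_range mine, size_t *nonlocal) {
    size_t count = 0;
    for (size_t i = mine.lo; i < mine.hi; ++i) {
        size_t j = idx[i];
        if (j < mine.lo || j >= mine.hi)
            nonlocal[count++] = j;
    }
    return count;
}

/* Executor: fetch all recorded non-local elements in one grouped step
   (a stand-in for a communication call), then run the original loop,
   reading non-local values from buf in the order the inspector saw them. */
double execute(const double *x, const size_t *idx, block_range mine,
               const size_t *nonlocal, size_t count, double *buf) {
    for (size_t k = 0; k < count; ++k)   /* simulated grouped transfer */
        buf[k] = x[nonlocal[k]];

    double sum = 0.0;
    size_t k = 0;
    for (size_t i = mine.lo; i < mine.hi; ++i) {
        size_t j = idx[i];
        if (j >= mine.lo && j < mine.hi)
            sum += x[j];                 /* locally owned element */
        else
            sum += buf[k++];             /* prefetched non-local element */
    }
    return sum;
}
```

If `idx` does not change between executions of the loop, the inspector's output can be reused and only the executor needs to run again. The INVARIANT directive described in the abstract appears aimed at conveying this kind of information to the compiler.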
