Journal: Parallel Computing

Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems


Abstract

OP2 is a high-level domain-specific library framework for the solution of unstructured mesh-based applications. It utilizes source-to-source translation and compilation so that a single application code written using the OP2 API can be transformed into multiple parallel implementations for execution on a range of back-end hardware platforms. In this paper we present the design and performance of OP2's recent developments facilitating code generation and execution on distributed memory heterogeneous systems. OP2 targets the solution of numerical problems based on static unstructured meshes. We discuss the main design issues in parallelizing this class of applications. These include handling data dependencies in accessing indirectly referenced data and design considerations in generating code for execution on a cluster of multi-threaded CPUs and GPUs. Two representative CFD applications, written using the OP2 framework, are utilized to provide a contrasting benchmarking and performance analysis study on a number of heterogeneous systems including a large-scale Cray XE6 system and a large GPU cluster. A range of performance metrics are benchmarked including runtime, scalability, achieved compute and bandwidth performance, runtime bottlenecks and system energy consumption. We demonstrate that an application written once at a high level using OP2 is easily portable across a wide range of contrasting platforms and is capable of achieving near-optimal performance without the intervention of the domain application programmer.
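The data dependency issue mentioned in the abstract can be illustrated with a small sketch. When a loop over mesh edges increments values stored on nodes through an edge-to-node mapping, two edges that share a node may write to the same entry at once. One common remedy is to color the edges so that no two edges of the same color touch the same node, and then process one color at a time. The code below is an illustration under that assumption only: it is not the OP2 API, the abstract does not state which technique OP2 uses, and the names `color_edges` and `increment_by_color` are hypothetical.

```python
from collections import defaultdict


def color_edges(edge_to_node):
    """Greedily give each edge the smallest color not already used by
    another edge touching one of the same nodes (hypothetical helper)."""
    used_at_node = defaultdict(set)  # node -> colors already taken there
    colors = []
    for nodes in edge_to_node:
        taken = set().union(*(used_at_node[n] for n in nodes))
        color = 0
        while color in taken:
            color += 1
        colors.append(color)
        for n in nodes:
            used_at_node[n].add(color)
    return colors


def increment_by_color(edge_to_node, edge_values, num_nodes):
    """Add each edge's value to its nodes, one color at a time."""
    colors = color_edges(edge_to_node)
    node_sum = [0.0] * num_nodes
    num_colors = max(colors) + 1 if colors else 0
    for c in range(num_colors):
        # Edges of one color share no node, so the iterations of this
        # inner loop could run in parallel without conflicting writes.
        for e, nodes in enumerate(edge_to_node):
            if colors[e] == c:
                for n in nodes:
                    node_sum[n] += edge_values[e]
    return node_sum, colors


# A four-node ring: each edge connects two neighbouring nodes.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
sums, ring_colors = increment_by_color(ring, [1.0, 2.0, 3.0, 4.0], 4)
```

The loop here is sequential; the point of the sketch is only that the coloring makes the per-color work independent, which is the property a threaded or GPU back end would need.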
