Conference: International Conference on Parallel Processing Workshops

Using HPX and OP2 for Improving Parallel Scaling Performance of Unstructured Grid Applications

Abstract

Computer scientists and programmers face the difficulty of improving the scalability of their applications while using only conventional programming techniques. As the baseline hypothesis of this paper, we assume that an advanced runtime system can take full advantage of the available parallel resources of a machine in order to achieve the highest parallelism possible. In this paper we present the capabilities of HPX, a distributed runtime system for parallel applications of any scale, to achieve the best possible scalability through asynchronous task execution [1]. OP2 is an active library which provides a framework for the parallel execution of unstructured grid applications on different multi-core/many-core hardware architectures [2]. OP2 generates code which uses OpenMP for loop parallelization within an application code for both single-threaded and multi-threaded machines. In this work we modify the OP2 code generator to target HPX instead of OpenMP, i.e., we port the parallel simulation backend of OP2 to utilize HPX. We compare the performance of the different parallelization methods using HPX and OpenMP for loop parallelization within the Airfoil application. We present the results of strong scaling and weak scaling tests for the Airfoil application on one node with up to 32 threads. Using HPX for the parallelization of OP2 improves performance by 5% to 21%. By modifying the OP2 code generator to use HPX's parallel algorithms, we observe scaling improvements of about 5% compared to OpenMP. To fully exploit the potential of HPX, we adapted the OP2 API to expose a future- and dataflow-based programming model and applied this technique to parallelize the same Airfoil application. We show that the dataflow-oriented programming model, which automatically creates an execution tree representing the algorithmic data dependencies of our application, improves the overall scaling results by about 21% compared to OpenMP. Our results show the advantage of using the asynchronous programming model implemented by HPX.
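
The future-based idea described in the abstract can be illustrated with a minimal sketch. It deliberately uses only the C++ standard library (`std::async`, `std::future`) rather than HPX or OP2, so it is not the authors' implementation and not HPX's API. The kernel names `scale` and `add` and the driver `run_dependency_graph` are hypothetical stand-ins for grid loops. Because `std::future` has no continuation chaining, a blocking `get()` stands in here for the non-blocking dataflow step.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for one parallel loop over grid data:
// multiplies every element by a constant factor.
std::vector<double> scale(std::vector<double> v, double factor) {
    for (double& x : v) x *= factor;
    return v;
}

// Hypothetical stand-in for a loop that consumes the results of two
// earlier loops: element-wise sum of two equally sized arrays.
std::vector<double> add(const std::vector<double>& a,
                        const std::vector<double>& b) {
    assert(a.size() == b.size());
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// The two scale() loops are independent, so each runs as its own
// asynchronous task. The add() loop depends on both results and waits
// only for those two futures, not for a barrier across all work.
std::vector<double> run_dependency_graph(const std::vector<double>& input) {
    std::future<std::vector<double>> doubled =
        std::async(std::launch::async, scale, input, 2.0);
    std::future<std::vector<double>> tripled =
        std::async(std::launch::async, scale, input, 3.0);
    // get() blocks until the corresponding task has produced its data.
    return add(doubled.get(), tripled.get());
}
```

In an OpenMP-style loop sequence, each loop typically ends with an implicit barrier. In the sketch, ordering comes only from which futures a step consumes, which is the property the abstract attributes to the dataflow-oriented model.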
