International Conference for High Performance Computing, Networking, Storage and Analysis

A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers

Abstract

Structured-grid PDE solver frameworks parallelize over boxes, which are rectangular domains of cells or faces in a structured grid. In the Chombo framework, box sizes are typically 16³ or 32³, but larger boxes such as 128³ would have less surface area relative to their volume and would therefore incur less storage, copying, and/or ghost-cell communication overhead. Unfortunately, current on-node parallelization schemes perform poorly for these larger box sizes. In this paper, we investigate 30 different inter-loop optimization strategies and demonstrate the parallel scaling advantages of some of these variants on NUMA multicore nodes. Shifted, fused, and communication-avoiding variants for 128³ boxes achieve close to ideal parallel scaling and come close to matching the performance of 16³ boxes on three different multicore systems, for a benchmark that is a proxy for program idioms found in Computational Fluid Dynamics (CFD) codes.
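
The surface-area argument for larger boxes can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the paper: it assumes a cubic box of N cells per side surrounded by a ghost layer of uniform width g, and the function name and the value g = 2 are illustrative assumptions only. It computes the number of ghost cells relative to the number of interior cells.

```python
def ghost_overhead(n: int, ghost_width: int) -> float:
    """Ratio of ghost cells to interior cells for an n^3 box.

    Assumes a cubic box with a ghost layer of uniform width on every
    face (hypothetical setup; the abstract does not state a ghost width).
    """
    interior = n ** 3
    padded = (n + 2 * ghost_width) ** 3  # box plus ghost layer on all sides
    return (padded - interior) / interior


if __name__ == "__main__":
    # Box sizes mentioned in the abstract; ghost width 2 is an assumption.
    for n in (16, 32, 128):
        print(f"{n}^3 box: ghost/interior = {ghost_overhead(n, 2):.3f}")
```

Under these assumptions the ratio shrinks steadily as the box grows, which is the effect the abstract refers to when it says larger boxes need less storage, copying, and ghost-cell communication.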