A Fast Poisson Solver with Periodic Boundary Conditions for GPU Clusters in Various Configurations.

Abstract

Fast Poisson solvers that use the Fast Fourier Transform (FFT) on uniform grids are especially well suited to parallel implementation, which makes them good candidates for porting to graphics processing unit (GPU) devices.

The goal of this work was to implement, test, and evaluate a fast Poisson solver with periodic boundary conditions for use on a variety of GPU configurations. The solver used in this research was FLASH, an immersed-boundary-based method that is well suited to complex, time-dependent geometries, has robust adaptive mesh refinement/de-refinement capabilities for capturing evolving flow structures, and has been successfully implemented on conventional parallel supercomputers. However, such solvers are still computationally costly to employ, and the total solver time is dominated by the solution of the pressure Poisson equation using state-of-the-art multigrid methods. FLASH improves the performance of its multigrid solvers by integrating a parallel FFT solver on a uniform grid at a coarse level. In theory, this hybrid solver could be improved further by replacing the highly parallelizable FFT solver with one that uses GPUs, which was the motivation for my research.

In the present work, the CPU-based parallel FFT solver (PFFT) used in the base version of FLASH to solve the Poisson equation on uniform grids has been modified to enable parallel execution on CUDA-enabled GPU devices. New algorithms that decompose the computational domain and send each new block to a GPU for parallel computation have been implemented to replace the Poisson solver. A one-dimensional (1-D) decomposition of the computational domain minimizes the network traffic involved in this bandwidth-intensive computation by limiting the amount of all-to-all communication required between processes. Advanced techniques have been incorporated in a GPU-centric code design, while giving end users the flexibility of parameter control at runtime in order to maximize throughput for all data sizes. The new code also allows multiple GPU devices to be used in a variety of configurations. For a Poisson equation with known source terms, the elapsed solution times of the newly implemented GPU-based solvers demonstrate speed-ups of up to 3.5 times over the CPU-based solver.
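
To make the idea behind an FFT-based periodic Poisson solve concrete, here is a minimal 1-D sketch in Python. It is not the thesis code: the function names `dft` and `solve_poisson_periodic_1d` are hypothetical, the transform is a naive O(N²) DFT standing in for a real FFT library, and the use of continuous wavenumbers is an assumption rather than a detail taken from the abstract. The steps are: transform the source term, divide each nonzero mode by −k², set the mean mode to zero, and transform back.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for j, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * j / n)
        # The inverse transform carries the 1/N normalization.
        out.append(acc / n if inverse else acc)
    return out


def solve_poisson_periodic_1d(f, length):
    """Solve u'' = f on a periodic uniform grid; returns the zero-mean solution."""
    n = len(f)
    f_hat = dft(f)
    u_hat = [0j] * n  # mode 0 stays zero: the mean of u is arbitrary, fix it to 0
    for m in range(1, n):
        # Map the array index to a signed frequency, then to a wavenumber.
        freq = m if m <= n // 2 else m - n
        k = 2.0 * math.pi * freq / length
        # In Fourier space the second derivative becomes multiplication by -k^2.
        u_hat[m] = -f_hat[m] / (k * k)
    return [c.real for c in dft(u_hat, inverse=True)]


# Usage: with f = -sin(x) on [0, 2*pi), the zero-mean solution is u = sin(x).
n_points = 16
domain_length = 2.0 * math.pi
grid = [domain_length * j / n_points for j in range(n_points)]
u = solve_poisson_periodic_1d([-math.sin(x) for x in grid], domain_length)
max_error = max(abs(ui - math.sin(x)) for ui, x in zip(u, grid))
print("max error vs. sin(x):", max_error)
```

The source term must have zero mean for a periodic solution to exist, which is why the sketch simply discards mode 0. In three dimensions the same division is applied mode by mode after a 3-D transform, and that transform is the step a distributed solver has to split across processes or devices.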

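Bibliographic record

  • Author

    Rattermann, Dale Nicholas

  • Author affiliation

    University of Cincinnati

  • Degree-granting institution: University of Cincinnati
  • Subject: Aerospace engineering; Computer science
  • Degree: M.S.
  • Year: 2014
  • Pages: 61 p.
  • Total pages: 61
  • Original format: PDF
  • Language of text: English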
