Journal: ACM Transactions on Graphics

A scalable Schur-complement fluids solver for heterogeneous compute platforms


Abstract

We present a scalable parallel solver for the pressure Poisson equation in fluids simulation which can accommodate complex irregular domains in the order of a billion degrees of freedom, using a single server or workstation fitted with GPU or Many-Core accelerators. The design of our numerical technique is attuned to the subtleties of heterogeneous computing, and allows us to benefit from the high memory and compute bandwidth of GPU accelerators even for problems that are too large to fit entirely on GPU memory. This is achieved via algebraic formulations that adequately increase the density of the GPU-hosted computation as to hide the overhead of offloading from the CPU, in exchange for accelerated convergence. Our solver follows the principles of Domain Decomposition techniques, and is based on the Schur complement method for elliptic partial differential equations. A large uniform grid is partitioned in non-overlapping subdomains, and bandwidth-optimized (GPU or Many-Core) accelerator cards are used to efficiently and concurrently solve independent Poisson problems on each resulting subdomain. Our novel contributions are centered on the careful steps necessary to assemble an accurate global solver from these constituent blocks, while avoiding excessive communication or dense linear algebra. We ultimately produce a highly effective Conjugate Gradients preconditioner, and demonstrate scalable and accurate performance on high-resolution simulations of water and smoke flow.
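To make the Schur complement idea in the abstract concrete, the following is a minimal sketch on a toy 1D Poisson problem, not the authors' solver. It assumes NumPy, a 15-node grid with Dirichlet boundaries, and a single interface node separating two subdomains; the names `poisson_1d` and `schur_solve` are hypothetical. Unlike the method described above, which avoids dense linear algebra and serves as a Conjugate Gradients preconditioner, this sketch assembles the interface system S = A_GG - sum_i A_GI A_II^{-1} A_IG densely and solves it directly, purely for illustration.

```python
import numpy as np

def poisson_1d(n):
    """Dense 3-point Laplacian with Dirichlet boundaries (toy stand-in)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def schur_solve(A, b, interior_blocks, interface):
    """Solve A x = b by eliminating subdomain interiors, then the interface.

    interior_blocks: list of index arrays, one per subdomain interior
    interface:       indices of the nodes separating the subdomains
    """
    G = np.asarray(interface)
    S = A[np.ix_(G, G)].copy()   # Schur complement (assembled densely here)
    g = b[G].copy()              # condensed right-hand side on the interface

    # Each subdomain contributes independently; these are the solves that
    # could run concurrently, one per subdomain.
    for I in interior_blocks:
        A_II = A[np.ix_(I, I)]
        A_IG = A[np.ix_(I, G)]
        A_GI = A[np.ix_(G, I)]
        S -= A_GI @ np.linalg.solve(A_II, A_IG)
        g -= A_GI @ np.linalg.solve(A_II, b[I])

    x = np.zeros_like(b)
    x[G] = np.linalg.solve(S, g)  # interface unknowns

    # Back-substitute: one more independent solve per subdomain.
    for I in interior_blocks:
        A_II = A[np.ix_(I, I)]
        A_IG = A[np.ix_(I, G)]
        x[I] = np.linalg.solve(A_II, b[I] - A_IG @ x[G])
    return x

n = 15
A = poisson_1d(n)
b = np.ones(n)
interface = [7]                                   # middle node splits the grid
blocks = [np.arange(0, 7), np.arange(8, 15)]      # two non-overlapping interiors

x_schur = schur_solve(A, b, blocks, interface)
x_direct = np.linalg.solve(A, b)
max_error = np.max(np.abs(x_schur - x_direct))
```

Because the two interiors couple only through the interface node, the block elimination is exact, so `x_schur` should match `x_direct` up to floating-point round-off. At the scale the abstract describes, the interface system would be far too large to form and factor this way, which is why the authors emphasize avoiding dense linear algebra.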
