Conference: Simulation Multi-Conference

MATRIX-FREE FINITE-ELEMENT COMPUTATIONS ON GRAPHICS PROCESSORS WITH ADAPTIVELY REFINED UNSTRUCTURED MESHES

Abstract

This paper concerns efficient matrix-free finite-element algorithms on modern manycore processors such as graphics cards (GPUs) as an alternative to sparse matrix-vector products. In matrix-free finite-element algorithms, the assembly and solution phases are merged, yielding a significantly lower memory bandwidth footprint, with a corresponding increase in efficiency on bandwidth-limited processors. Additionally, no system matrix needs to be assembled or stored in memory. We present a GPU parallelization of the matrix-free method, including a novel algorithm for resolving hanging-node constraints on the GPU, which enables simulation on adaptively refined grids. For second-order and higher elements in 3D, our GPU implementation of the adaptive algorithm is between 1.8 and 2.3 times faster than an existing optimized CPU version on comparable hardware. Compared to a matrix-based implementation using CUSPARSE, we obtain a speedup of 8 and can solve problems 8 times larger in 3D.
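To make the idea of merging assembly into the operator application concrete, the following is a minimal CPU sketch, not the authors' GPU algorithm. It assumes a 1D Poisson operator with linear elements on a uniform mesh and ignores boundary conditions and hanging nodes; all function names (`local_stiffness`, `apply_matrix_free`, `assemble_dense`, `matvec`) are hypothetical. The matrix-free routine computes y = A u element by element without ever storing A, and the dense routine is included only as a reference for comparison.

```python
# Illustrative sketch only: 1D Poisson operator, linear elements, uniform mesh.
# All names here are hypothetical and not taken from the paper.

def local_stiffness(h):
    """2x2 element stiffness matrix for -u'' with linear shape functions."""
    return [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]


def apply_matrix_free(u, n_elements, h):
    """Compute y = A u by looping over elements; the global A is never stored."""
    y = [0.0] * (n_elements + 1)
    k = local_stiffness(h)
    for e in range(n_elements):
        dofs = (e, e + 1)                 # global indices of the element's nodes
        u_loc = [u[d] for d in dofs]      # gather local values
        for i in range(2):
            # apply the local operator and scatter-add into the global result
            y[dofs[i]] += k[i][0] * u_loc[0] + k[i][1] * u_loc[1]
    return y


def assemble_dense(n_elements, h):
    """Reference only: assemble the global matrix that matrix-free avoids."""
    n = n_elements + 1
    a = [[0.0] * n for _ in range(n)]
    k = local_stiffness(h)
    for e in range(n_elements):
        dofs = (e, e + 1)
        for i in range(2):
            for j in range(2):
                a[dofs[i]][dofs[j]] += k[i][j]
    return a


def matvec(a, u):
    """Plain dense matrix-vector product."""
    return [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in a]
```

Both routes give the same vector, but the matrix-free one only ever touches the input vector, the output vector, and one small local matrix, which is the source of the reduced memory traffic described in the abstract. In a GPU setting the element loop would be parallelized, which requires handling concurrent scatter-adds to shared nodes; that part is not shown here.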