Journal: Computing

GPU-based matrix-free finite element solver exploiting symmetry of elemental matrices



Abstract

Matrix-free solvers for the finite element method (FEM) avoid assembling elemental matrices into a global sparse matrix and replace the sparse matrix-vector multiplication required by iterative solution methods with element-level dense matrix-vector products. This paper proposes a novel matrix-free strategy for FEM that computes the element-level matrix-vector product using only the symmetric part of the elemental matrices. The strategy is designed to exploit the massive parallelism of Graphics Processing Units (GPUs). A unique data structure is also introduced that ensures localized and coalesced memory access suitable for a GPU while storing only the symmetric part of the elemental matrices. In addition, the strategy emphasizes efficient use of the register cache, uniform workload distribution, reduced thread synchronization, and sufficient granularity to make the best use of GPU resources. Its performance is evaluated by solving elasticity and heat conduction problems using 4-noded quadrilateral elements with two degrees of freedom (DOFs) and one DOF per node, respectively, and is compared with matrix-free GPU solver strategies from the literature. The proposed strategy achieves a maximum speedup of 4.9x for the elasticity problem and 3.2x for the heat conduction problem. It also uses the least GPU memory among the strategies compared.
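The core idea of the abstract, an element-level product that touches only the stored symmetric half of each elemental matrix, can be illustrated with a small CPU reference sketch. This is not the paper's GPU kernel or its data structure: the function names (`pack_upper`, `sym_matvec`, `matrix_free_matvec`), the row-wise packed upper-triangle layout, and the 1D two-node test elements are all assumptions made for illustration only.

```python
# Hypothetical CPU sketch of a matrix-free product y = K x that stores only
# the upper triangle of each symmetric elemental matrix. The layout and the
# names are illustrative assumptions, not the paper's GPU implementation.

def pack_upper(Ke):
    """Return the upper triangle (diagonal included) of Ke, row by row."""
    n = len(Ke)
    return [Ke[i][j] for i in range(n) for j in range(i, n)]

def sym_matvec(packed, xe):
    """Compute ye = Ke @ xe from the packed upper triangle only."""
    n = len(xe)
    ye = [0.0] * n
    k = 0
    for i in range(n):
        ye[i] += packed[k] * xe[i]      # diagonal entry Ke[i][i]
        k += 1
        for j in range(i + 1, n):
            a = packed[k]               # Ke[i][j] == Ke[j][i]
            ye[i] += a * xe[j]          # contribution of the upper entry
            ye[j] += a * xe[i]          # mirrored contribution of the lower entry
            k += 1
    return ye

def matrix_free_matvec(packed_elems, conn, x, n_dofs):
    """Global product without assembling K: gather, local product, scatter-add."""
    y = [0.0] * n_dofs
    for packed, dofs in zip(packed_elems, conn):
        xe = [x[d] for d in dofs]       # gather element DOF values
        ye = sym_matvec(packed, xe)     # dense symmetric element product
        for d, v in zip(dofs, ye):
            y[d] += v                   # scatter-add into the global vector
    return y

# Two 1D two-node elements sharing DOF 1 (illustrative data only).
Ke = [[1.0, -1.0], [-1.0, 1.0]]
packed_elems = [pack_upper(Ke), pack_upper(Ke)]
conn = [[0, 1], [1, 2]]
y = matrix_free_matvec(packed_elems, conn, [1.0, 2.0, 4.0], 3)
```

With this packing, an n x n symmetric elemental matrix needs n(n+1)/2 stored values instead of n^2: 36 instead of 64 for an 8-DOF element (four nodes, two DOFs each) and 10 instead of 16 for a 4-DOF element. On a GPU the scatter-add step would additionally need atomics or a coloring scheme to avoid write conflicts, which this serial sketch does not address.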
