International Conference on Computational Science

Development of Element-by-Element Kernel Algorithms in Unstructured Implicit Low-Order Finite-Element Earthquake Simulation for Many-Core Wide-SIMD CPUs


Abstract

Acceleration of the Element-by-Element (EBE) kernel in matrix-vector products is essential for high performance in unstructured implicit finite-element applications. However, attaining high performance with the EBE kernel is not straightforward because of random data access with data recurrence. In this paper, we develop methods to circumvent the resulting data races for high performance on many-core CPU architectures with wide SIMD units. The developed EBE kernel attains 16.3% and 20.9% of FP32 peak on the Intel Xeon Phi Knights Landing-based Oakforest-PACS and on an Intel Skylake Xeon Gold processor-based system, respectively. This leads to a 2.88-fold speedup over the baseline kernel and a 2.03-fold speedup of the whole finite-element application on Oakforest-PACS. An example of an urban earthquake simulation using the developed finite-element application is shown.
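To illustrate what "data recurrence" means in an EBE matrix-vector product, here is a minimal Python/NumPy sketch. The abstract does not describe the paper's actual algorithms, so this uses element coloring, a commonly used generic workaround, and should not be read as the authors' method. The 1D chain mesh, the 2x2 element matrix, and the names `greedy_color` and `ebe_matvec` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def greedy_color(conn, n_nodes):
    """Give each element a color so that same-colored elements share no node."""
    node_colors = [set() for _ in range(n_nodes)]  # colors already present at each node
    colors = np.empty(len(conn), dtype=np.int64)
    for e, nodes in enumerate(conn):
        used = set().union(*(node_colors[n] for n in nodes))
        c = 0
        while c in used:
            c += 1
        colors[e] = c
        for n in nodes:
            node_colors[n].add(c)
    return colors

def ebe_matvec(ke, conn, colors, x):
    """Compute y = A x from element matrices ke[e] without assembling A."""
    y = np.zeros_like(x)
    for c in range(colors.max() + 1):
        idx = np.flatnonzero(colors == c)
        nodes = conn[idx]                                # (n_elem_in_color, nodes_per_elem)
        ye = np.einsum("eij,ej->ei", ke[idx], x[nodes])  # batched local products
        y[nodes] += ye  # safe: no node index repeats within one color
    return y

# Toy mesh (assumption): a 1D chain of 2-node elements, not the paper's mesh.
n_nodes = 6
conn = np.array([[i, i + 1] for i in range(n_nodes - 1)])
ke = np.tile(np.array([[1.0, -1.0], [-1.0, 1.0]]), (len(conn), 1, 1))
x = np.arange(n_nodes, dtype=float) ** 2

colors = greedy_color(conn, n_nodes)
y_ebe = ebe_matvec(ke, conn, colors, x)

# Reference result: assemble the dense global matrix and multiply.
A = np.zeros((n_nodes, n_nodes))
for e, nodes in enumerate(conn):
    A[np.ix_(nodes, nodes)] += ke[e]
y_ref = A @ x
```

Processing all elements in a single vectorized scatter (`y[conn] += ...`) would silently drop contributions at shared nodes, because several elements write to the same entry of `y`. The same kind of conflict arises between SIMD lanes or threads in a compiled kernel. Coloring removes the conflict at the cost of less regular memory access.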
