Journal: IEEE Transactions on Parallel and Distributed Systems

Analysis of GPU Data Access Patterns on Complex Geometries for the D3Q19 Lattice Boltzmann Algorithm



Abstract

GPU performance of the lattice Boltzmann method (LBM) depends heavily on memory access patterns. When LBM is implemented on GPUs for complex domains, geometric data is typically accessed indirectly and lattice data is accessed lexicographically. Although a variety of other options exist, no study has examined their relative efficacy. Here, we examine a suite of memory access schemes via empirical testing and performance modeling. We find strong evidence that semi-direct addressing is often better suited than the more common indirect addressing, providing increased computational speed and reduced memory consumption. For the layout, we find that the Collected Structure of Arrays (CSoA) and bundling layouts outperform the common Structure of Arrays layout; on V100 and P100 devices, CSoA consistently outperforms bundling, whereas the relationship is more complicated on K40 devices. When compared to state-of-the-art practices, our recommendations lead to speedups of 10-40 percent and reduce memory consumption by up to 17 percent. Using performance modeling and computational experimentation, we determine the mechanisms behind the accelerations. We demonstrate that our results hold across multiple GPUs on two leadership class systems, and present the first near-optimal strong results for LBM with arterial geometries run on GPUs.
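
The abstract names the layouts and addressing schemes without defining them, so the following is only an illustrative sketch of the kind of index arithmetic involved, not the paper's implementation. It assumes that CSoA means a Structure of Arrays applied within fixed-size chunks of nodes, and that semi-direct addressing means a dense map from lattice position to compact fluid-node index; both readings, the chunk size of 32, and every function name here are assumptions made for illustration.

```python
Q = 19  # D3Q19: 19 discrete velocities per lattice node


def soa_index(node, q, n_nodes):
    """Structure of Arrays: all nodes for velocity 0, then velocity 1, ..."""
    return q * n_nodes + node


def csoa_index(node, q, chunk=32):
    """Assumed CSoA reading: SoA ordering inside fixed-size chunks of nodes.

    Assumes the node count is padded to a multiple of `chunk`.
    """
    block, lane = divmod(node, chunk)
    return (block * Q + q) * chunk + lane


def build_site_map(is_fluid):
    """Assumed semi-direct reading: dense map from flat lattice position
    to compact fluid-node index, with -1 marking non-fluid sites."""
    site_map, count = [], 0
    for fluid in is_fluid:
        site_map.append(count if fluid else -1)
        count += 1 if fluid else 0
    return site_map


def neighbor_semi_direct(site_map, dims, pos, offset):
    """Compute the neighbor's position arithmetically, then look up its
    compact index; returns -1 for solid or out-of-bounds neighbors."""
    nx, ny, nz = dims
    x, y, z = (p + o for p, o in zip(pos, offset))
    if not (0 <= x < nx and 0 <= y < ny and 0 <= z < nz):
        return -1
    return site_map[(z * ny + y) * nx + x]
```

Under these assumptions, both layouts keep consecutive nodes of one velocity adjacent in memory, which is what coalesced GPU loads favor; the chunked variant additionally keeps all 19 values of a node within one small region. An indirect scheme would instead store an explicit list of neighbor indices for every fluid node, whereas the semi-direct sketch stores a single integer per lattice site.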
