Journal: Journal of Supercomputing

From tile algorithm to stripe algorithm: a CUBLAS-based parallel implementation on GPUs of Gauss method for the resolution of extremely large dense linear systems stored on an array of solid state devices


Abstract

This paper presents an efficient algorithmic approach to the GPU-based parallel resolution of extremely large dense linear systems. A formal transformation of the code of the Gauss method allows us to develop, for matrix calculations, the concept of a stripe algorithm, as opposed to that of a tile algorithm. Our stripe algorithm is based on partitioning the linear system's matrix into stripes of rows and is well suited to efficient implementation on a GPU, using the cublasDgemm function of the CUBLAS library as the main building block. It is also well adapted to storing the linear system on an array of solid state devices (SSDs), with the PC memory used as a cache between the SSDs and the GPU memory. We demonstrate experimentally that our code efficiently solves dense linear systems of size up to 400,000 (160 billion matrix elements) using an NVIDIA C2050 and six 240 GB SSDs.
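To make the idea of row stripes concrete, here is a minimal CPU-only sketch in plain Python. It is not the authors' code, and it is only one way to organise Gaussian elimination by stripes. The names `solve_by_stripes`, `matmul`, and `stripe_rows` are hypothetical. The sketch assumes that no pivoting is needed (for example, a strictly diagonally dominant matrix) and keeps everything in memory. The `matmul` helper stands in for the matrix-matrix product that the abstract assigns to cublasDgemm.

```python
def matmul(x, y):
    """Plain dense product x @ y; a stand-in for a GEMM call."""
    inner, cols = len(y), len(y[0])
    return [[sum(row[k] * y[k][j] for k in range(inner)) for j in range(cols)]
            for row in x]


def solve_by_stripes(a, b, stripe_rows):
    """Solve a x = b by Gaussian elimination organised by stripes of rows.

    Assumption: no pivoting is performed, so the matrix must be one for
    which that is safe (for example, strictly diagonally dominant).
    """
    n = len(a)
    # Augmented matrix [A | b]; copying leaves the caller's data untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]

    for start in range(0, n, stripe_rows):
        end = min(start + stripe_rows, n)

        # Step 1: Gauss-Jordan inside the current stripe, so that its
        # diagonal block becomes the identity.
        for p in range(start, end):
            pivot = m[p][p]
            m[p] = [v / pivot for v in m[p]]
            for r in range(start, end):
                if r != p:
                    f = m[r][p]
                    m[r] = [vr - f * vp for vr, vp in zip(m[r], m[p])]

        # Step 2: update each later stripe with a single matrix product:
        # later_stripe -= multipliers @ current_stripe.
        stripe = m[start:end]
        for s2 in range(end, n, stripe_rows):
            e2 = min(s2 + stripe_rows, n)
            multipliers = [row[start:end] for row in m[s2:e2]]
            product = matmul(multipliers, stripe)
            for r, prod_row in zip(range(s2, e2), product):
                m[r] = [v - u for v, u in zip(m[r], prod_row)]

    # Back substitution: the matrix is now upper triangular with a unit
    # diagonal, and the last column holds the transformed right-hand side.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 0.0],
         [1.0, 5.0, 2.0],
         [0.0, 2.0, 6.0]]
    b = [6.0, 15.0, 20.0]
    print(solve_by_stripes(a, b, stripe_rows=2))
```

In this sketch almost all the arithmetic sits in step 2, which is a dense product per stripe; that is the kind of work a GEMM routine is built for. For scale, a system of size 400,000 has 400,000² = 1.6 × 10^11 elements. At 8 bytes per element (assuming double precision, as the use of cublasDgemm suggests), that is about 1.28 TB, which fits within the 6 × 240 GB = 1.44 TB of the SSD array.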
