Conference: IEEE Global Conference on Signal and Information Processing

A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures


Abstract

Large matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection, so an algorithm that inverts large matrices quickly and accurately is essential. Meanwhile, the Graphics Processing Unit (GPU) provides a low-cost, flexible multicore architecture for high-performance computing, which has attracted many researchers to building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm, which adapts effectively to the GPU architecture. Implemented on the Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT 620 graphics processor.
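To make the rotation-based approach concrete, here is a minimal sequential sketch of matrix inversion by QR decomposition with classic Givens rotations. It is not the paper's modified SGR algorithm, which the abstract does not spell out, and it does not run on a GPU. The function name `givens_inverse` and the `1e-12` singularity threshold are assumptions made for illustration only.

```python
import math


def givens_inverse(a):
    """Invert a square matrix via Givens-rotation QR: Q^T A = R, so A^-1 = R^-1 Q^T.

    Illustrative sketch with classic (not squared) Givens rotations.
    """
    n = len(a)
    # Work on copies: r is reduced to upper triangular form, qt accumulates Q^T.
    r = [[float(x) for x in row] for row in a]
    qt = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for j in range(n):                # column being cleared
        for i in range(j + 1, n):     # zero r[i][j] against pivot row j
            if r[i][j] == 0.0:
                continue
            h = math.hypot(r[j][j], r[i][j])
            c, s = r[j][j] / h, r[i][j] / h
            # Apply the same 2x2 rotation to rows j and i of both matrices.
            for mat in (r, qt):
                row_j, row_i = mat[j], mat[i]
                mat[j] = [c * x + s * y for x, y in zip(row_j, row_i)]
                mat[i] = [-s * x + c * y for x, y in zip(row_j, row_i)]

    # Back substitution: solve R X = Q^T one row at a time, bottom up.
    inv = [[0.0] * n for _ in range(n)]
    for k in range(n - 1, -1, -1):
        if abs(r[k][k]) < 1e-12:      # assumed tolerance, not from the paper
            raise ValueError("matrix is singular to working precision")
        for col in range(n):
            acc = qt[k][col] - sum(r[k][t] * inv[t][col] for t in range(k + 1, n))
            inv[k][col] = acc / r[k][k]
    return inv


if __name__ == "__main__":
    for row in givens_inverse([[4.0, 3.0], [6.0, 3.0]]):
        print(row)
```

Each rotation updates every column of the two affected rows independently, which is the kind of data parallelism a GPU kernel can exploit. How the paper maps its modified SGR onto CUDA threads is not stated in the abstract.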
