A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures

Abstract

Large matrix inversion is a basic step in a wide range of signal processing and numerical problems, such as digital filtering and equalization detection. An algorithm that inverts large matrices quickly and accurately is therefore essential. Meanwhile, the Graphics Processing Unit (GPU) provides a low-cost, flexible multicore architecture for high-performance computing, which has attracted many researchers to building GPU-based software-defined radio (SDR). In this paper, we propose a fast parallel algorithm for matrix inversion on heterogeneous multicore architectures that exploits the computational power of the GPU. Our implementation is based on a modified Squared Givens Rotations (SGR) algorithm, which adapts effectively to the GPU architecture. Implemented on the Compute Unified Device Architecture (CUDA), it achieves a speedup of more than 20x over the CPU-only algorithm when the matrix becomes large, and runs at up to 12.14 GFLOPS on a GeForce GT620 graphics processor.
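For orientation, the sketch below shows how rotation-based inversion works in its simplest sequential form. It uses classical Givens rotations, not the squared variant, and it is not the authors' modified SGR algorithm or their CUDA implementation, neither of which is described in this record. The function name `invert_via_givens` is hypothetical. The idea: rotations reduce the augmented matrix [A | I] to [R | Q^T] with R upper triangular, and back substitution on R X = Q^T then yields the inverse of A.

```python
import math


def invert_via_givens(a):
    """Invert a square matrix using classical Givens rotations (illustrative only)."""
    n = len(a)
    # Augmented matrix [A | I]; the rotations turn it into [R | Q^T].
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]

    for col in range(n):
        for row in range(col + 1, n):
            x, y = m[col][col], m[row][col]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            r = math.hypot(x, y)
            c, s = x / r, y / r
            # Rotate the two rows; columns left of `col` are already zero in both.
            for k in range(col, 2 * n):
                top, bot = m[col][k], m[row][k]
                m[col][k] = c * top + s * bot
                m[row][k] = -s * top + c * bot

    # Back substitution: solve R X = Q^T one column at a time.
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n - 1, -1, -1):
            if m[i][i] == 0.0:
                raise ValueError("matrix is singular")
            acc = m[i][n + j] - sum(m[i][k] * inv[k][j] for k in range(i + 1, n))
            inv[i][j] = acc / m[i][i]
    return inv


if __name__ == "__main__":
    for line in invert_via_givens([[4, 3], [6, 3]]):
        print(line)
```

Each rotation touches only two rows, and rotations acting on disjoint row pairs are independent of one another, which is the property that makes this family of methods a natural candidate for parallel hardware. The classical form shown here computes a square root per rotation (inside `math.hypot`); squared Givens variants are generally described as avoiding that operation.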

Bibliographic details

Source
IEEE Global Conference on Signal and Information Processing | 2015 | 5 pages
Authors
Denggao Yu; Shiwen He; Yongming Huang; Guangshi Yu; Luxi Yang
Original format
PDF
Chinese Library Classification
TN911.7-53
Keywords
matrix inversion; high performance computing; software-defined radio; GPU; CUDA

Similar literature

1. Parallelization Strategy for Elementary Morphological Operators on Graphs: Distance-Based Algorithms and Implementation on Multicore Shared-Memory Architecture [J]. Youkana Imane, Cousty Jean, Saouli Rachida, Journal of Mathematical Imaging and Vision. 2017, No. 1
2. Adaptive parallel interval branch and bound algorithms based on their performance for multicore architectures [J]. J. F. Sanjuan-Estrada, L. G. Casado, I. García, The Journal of Supercomputing. 2011, No. 3
3. Adaptive parallel interval branch and bound algorithms based on their performance for multicore architectures [J]. J.F. Sanjuan-Estrada, L.G. Casado, I. Garcia, Journal of Supercomputing. 2011, No. 3
4. A Fast Parallel Matrix Inversion Algorithm based on Heterogeneous Multicore Architectures [C]. Denggao Yu, Shiwen He, Yongming Huang, IEEE Global Conference on Signal and Information Processing. 2015
5. Three-dimensional inversion of magnetotelluric data from the Coso Geothermal Field, based on a finite difference Gauss-Newton method parallelized on a multicore workstation [D]. Maris, Virginie. 2011
6. A Parallel Point Matching Algorithm for Landmark Based Image Registration Using Multicore Platform [O]. Lin Yang, Leiguang Gong, Hong Zhang
7. Network Coding Parallelization Based on Matrix Operations for Multicore Architectures [O]. Wunderlich Simon, Cabrera Juan, Fitzek Frank, 2015
