FPGA-Based Hardware Matrix Inversion Architecture Using Hybrid Piecewise Polynomial Approximation Systolic Cells

Abstract

Many signal processing algorithms require a hardware matrix inversion architecture based on QR decomposition with Givens rotations (GR) and a back substitution (BS) block. However, a hardware implementation of the GR algorithm requires complex operations such as the reciprocal square root (RSR), which is typically implemented with lookup tables (LUTs) or COordinate Rotation DIgital Computer (CORDIC) units, among others, leading to either high area consumption or low throughput. This paper introduces a Field-Programmable Gate Array (FPGA)-based full matrix inversion architecture using hybrid piecewise polynomial approximation systolic cells. The design incorporates a hybrid segmentation technique for implementing the piecewise polynomial systolic cells. The hybrid approach is composed of an external and an internal segmentation: the external one is nonuniform and the internal one is uniform, so the segments fit the curve shape of the complex functions and achieve a better signal-to-quantization-noise ratio; it also improves timing performance and area usage. Experimental results show a well-balanced design that achieves high throughput with lower resource utilization than state-of-the-art FPGA-based architectures. In our study, the proposed design achieves 7.51 mega-matrices per second for 4 × 4 matrix operations with a latency of 12 clock cycles; the hardware design requires only 1474 slice registers and 1458 LUTs on a Virtex-5 XC5VLX220T FPGA, and 1474 slice registers and 1378 LUTs on a Virtex-6 XC6VLX240T FPGA.
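The hybrid segmentation described in the abstract can be illustrated with a small software model. The abstract does not state the segment boundaries, the polynomial degree, or the fixed-point word lengths, so everything concrete below is an assumption made for illustration: the external (nonuniform) segments are taken to be power-of-two octaves, each octave is split into 8 equal internal segments, the polynomial is degree 1 (a chord through the segment endpoints), and arithmetic is floating point. The function names `build_table`, `rsqrt_approx`, and `givens_coeffs` are hypothetical and do not come from the paper.

```python
import math


def build_table(lo_exp=-8, hi_exp=8, inner=8):
    """Precompute degree-1 coefficients for 1/sqrt(x) on every segment.

    External segmentation (assumed): one octave [2**e, 2**(e+1)) per exponent e,
    so segments are narrow where the curve is steep and wide where it is flat.
    Internal segmentation (assumed): each octave is cut into `inner` equal pieces.
    """
    table = {}
    for e in range(lo_exp, hi_exp):
        base = 2.0 ** e
        width = base / inner
        for k in range(inner):
            x0 = base + k * width
            x1 = x0 + width
            y0 = 1.0 / math.sqrt(x0)
            y1 = 1.0 / math.sqrt(x1)
            slope = (y1 - y0) / width  # chord through the segment endpoints
            table[(e, k)] = (x0, y0, slope)
    return table


def rsqrt_approx(x, table, inner=8):
    """Approximate 1/sqrt(x) for x > 0 inside the table range.

    Raises KeyError if x falls outside the octaves covered by `table`.
    """
    _, ex = math.frexp(x)            # x = m * 2**ex with 0.5 <= m < 1
    e = ex - 1                       # so x lies in [2**e, 2**(e+1))
    base = 2.0 ** e
    k = min(int((x - base) / (base / inner)), inner - 1)  # internal segment index
    x0, y0, slope = table[(e, k)]
    return y0 + slope * (x - x0)


def givens_coeffs(a, b, table):
    """Givens rotation coefficients (c, s) that zero b against a.

    Uses the approximated reciprocal square root instead of a divide and sqrt,
    which is the operation the abstract identifies as costly in hardware.
    """
    inv_r = rsqrt_approx(a * a + b * b, table)
    return a * inv_r, b * inv_r
```

As a usage example, `givens_coeffs(3.0, 4.0, build_table())` returns values close to 0.6 and 0.8; with these assumed parameters the relative error of `rsqrt_approx` stays below roughly 0.2 percent, and raising `inner` or the polynomial degree trades table size for accuracy.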

Bibliographic details

Source: Progress in Artificial Intelligence, 2020, Issue 1, 14 pages
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology
Keywords: field programmable gate arrays; matrix inversion; piecewise polynomial approximation; QR decomposition; systolic arrays

Similar literature

1. FPGA-Based Hardware Matrix Inversion Architecture Using Hybrid Piecewise Polynomial Approximation Systolic Cells [J]. Progress in Artificial Intelligence. 2020, Issue 1
2. Hardware-efficient systolic architecture for inversion and division in GF(2^m) [J]. Guo J.-H., Wang C.-L. IEE Proceedings, Part E. 1998, Issue 4
3. Hardware-efficient systolic architecture for inversion and division in GF(2^m) [J]. J.-H. Guo, C.-L. Wang. IEE Proceedings, Part E. 1998, Issue 4
4. An improved hardware design for matrix inverse based on systolic array QR decomposition and piecewise polynomial approximation [C]. L. Canche Santos, A. Castillo Atoche, J. Vazquez Castilloy. International Conference on Reconfigurable Computing and FPGAs. 2015
5. Continuous piecewise polynomial approximations in cloud network forensics [D]. Perry, Alexander K. 2014
6. Piecewise Polynomial Approximations to the ICRP 116 Effective Dose Coefficients: Photons and Neutrons [O]. M M Mille, N E Hertel, P M Bergstrom
7. Exploration of FPGA-Based Hardware Designs for QR Decomposition for Solving Stiff ODE Numerical Methods Using the HARP Hybrid Architecture [O]. Carlos Alberto Oliveira de Souza Junior, João Bispo, João M. P. Cardoso. 2020
8. Trial Functions for the Space Dependence of the LIFE Fuel-Pin Performance Equations/Piecewise Cubic Polynomial Interpolation Approximations [R]. O'Reilly, B. D. 1975
