A High Throughput FPGA-Based Floating Point Conjugate Gradient Implementation

Abstract

As Field Programmable Gate Arrays (FPGAs) have reached capacities beyond millions of equivalent gates, it becomes possible to accelerate floating-point scientific computing applications. One type of calculation that is commonplace in scientific computation is the solution of systems of linear equations. A method that has proven in software to be very efficient and robust for finding such solutions is the Conjugate Gradient algorithm. In this paper we present a parallel hardware Conjugate Gradient implementation. The implementation is particularly suited for accelerating multiple small- to medium-sized dense systems of linear equations. Through parallelization it is possible to reduce the computation time per iteration for an order-n matrix from Θ(n^2) cycles for a software implementation to Θ(n). I/O requirements are scalable and converge to a constant value as the matrix order increases. Results on a Virtex-II 6000 demonstrate sustained performance of 5 GFLOPS, and projected results on a Virtex-5 330 indicate sustained performance of 35 GFLOPS. The former result is comparable to high-end CPUs, whereas the latter represents a significant speedup.
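For orientation, the software baseline that the abstract refers to can be sketched as a textbook Conjugate Gradient solver for a dense symmetric positive-definite system. This is a minimal pure-Python illustration, not the authors' hardware design; the function names (`dot`, `matvec`, `conjugate_gradient`) and the parameters `tol` and `max_iter` are assumptions made for this example. The matrix-vector product is the only step that costs Θ(n^2) operations per iteration; every other step is Θ(n).

```python
def dot(u, v):
    """Inner product of two equal-length vectors: Theta(n) work."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product: n dot products, Theta(n^2) work."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive-definite dense matrix A.

    tol and max_iter are illustrative stopping parameters (assumptions),
    not values taken from the paper.
    """
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)          # residual b - A x, with the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol:
            break        # residual norm is small enough
        Ap = matvec(A, p)                     # the Theta(n^2) step
        alpha = rs_old / dot(p, Ap)           # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old                # conjugacy coefficient
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

A small usage example: `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approximates the exact solution (1/11, 7/11). Each of the n dot products inside `matvec` is independent of the others, which is what makes this step a natural target for parallel hardware.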

Bibliographic details

Source: International Workshop on Reconfigurable Computing, 2008, 12 pages
Authors: Antonio Roldao Lopes; George A. Constantinides
Original format: PDF
Chinese Library Classification: TP3-53

Similar literature

1. A High Throughput FPGA-Based Floating Point Conjugate Gradient Implementation for Dense Matrices [J]. Antonio Roldao, George A. Constantinides. ACM Transactions on Reconfigurable Technology and Systems, 2010, Issue 1.
2. SOMprocessor: A high throughput FPGA-based architecture for implementing Self-Organizing Maps and its application to video processing [J]. Neural Networks: The Official Journal of the International Neural Network Society, 2020.
3. High-throughput parallel DWT hardware architecture implemented on an FPGA-based platform [J]. Ibraheem Mohammed Shaaban, Hachicha Khalil, Ahmed Syed Zahid. Journal of Real-Time Image Processing, 2019, Issue 6.
4. A High Throughput FPGA-Based Floating Point Conjugate Gradient Implementation [C]. Antonio Roldao Lopes, George A. Constantinides. Reconfigurable Computing: Architectures, Tools and Applications, 2008.
5. Investigation of anisotropic charge transport in conjugated polymer based organic FETs by controlling the molecular orientation in large area ribbon-shaped floating films [D]. Tripathi Atul Shankar Mani, 2019.
6. FPGA-Based Implementation of Stochastic Configuration Networks for Regression Prediction [O]. Yunqi Gao, Feng Luan, Jiaqi Pan, 2020.
7. High Throughput FPGA-based Floating Point Conjugate Gradient Implementation [O]. Antonio Roldao, George A. Constantinides, 2013.
