Journal of Parallel and Distributed Computing

FPGA, GPU, and CPU implementations of Jacobi algorithm for eigenanalysis

Abstract

This paper discusses parallel implementations of the Jacobi algorithm for the eigenanalysis of a matrix on the most commonly used high-performance computing (HPC) devices: the central processing unit (CPU), the graphics processing unit (GPU), and the field-programmable gate array (FPGA). Their performance is investigated and compared. The CPU, even with a multi-threaded implementation, is shown not to be a feasible option for large dense matrices. For the GPU implementation, the paper emphasizes the performance impact of global memory access patterns on the GPU board and of memory coalescing, and proposes three memory access methods. One of them achieves an 81.6% improvement in computational performance over traditional GPU methods and runs 68.5 times faster than a single-threaded CPU for a dense symmetric square matrix of size 1,024. An FPGA implementation is also presented, and its performance on chips from two major manufacturers is reported. The GPU and FPGA implementations are compared quantitatively and ranked. The FPGA design is reported to deliver the best performance for this task, while the GPU is a strong competitor that requires less development effort and scales better. The authors predict that emerging big data applications will benefit from real-time, high-performance implementations of eigenanalysis for information inference and signal analytics.
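For readers unfamiliar with the method, the following is a minimal serial sketch of a cyclic Jacobi eigenvalue iteration for a real symmetric matrix, in plain Python. It is an illustration only and is not taken from the paper: the function name `jacobi_eigen`, its parameters `tol` and `max_sweeps`, and the sweep order are assumptions made here. Each rotation zeroes one off-diagonal entry, and rotations acting on disjoint index pairs are independent of one another, which is the property that parallel implementations typically exploit.

```python
import math


def jacobi_eigen(a, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi sketch for a real symmetric matrix (hypothetical helper).

    Returns (eigenvalues, v), where column j of v is the eigenvector
    belonging to eigenvalues[j].
    """
    n = len(a)
    a = [list(map(float, row)) for row in a]  # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for _ in range(max_sweeps):
        # Frobenius norm of the off-diagonal part; stop once it is negligible.
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle with |theta| <= pi/4 that zeroes a[p][q]:
                # tan(2*theta) = 2*a_pq / (a_qq - a_pp).
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # A <- A J (update columns p and q).
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # A <- J^T A (update rows p and q).
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                # Accumulate eigenvectors: V <- V J.
                for k in range(n):
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq

    return [a[i][i] for i in range(n)], v


if __name__ == "__main__":
    vals, vecs = jacobi_eigen([[2.0, 1.0], [1.0, 2.0]])
    print(sorted(vals))  # approximately [1.0, 3.0]
```

This sketch costs O(n^3) work per sweep on a single thread, so it is suitable only for small matrices; it says nothing about the memory access methods or hardware designs evaluated in the paper.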