Journal: Microprocessors and Microsystems

Processor arrays generation for matrix algorithms used in embedded platforms implemented on FPGAs


Abstract

Matrix algorithms are an important part of many digital signal processing applications, as they are core kernels that usually have to be applied many times while computing different tasks. Hardware-assisted implementations using FPGAs provide a good compromise between performance, cost and power consumption, especially when high-level synthesis techniques are employed for deriving co-processors. In this paper, a high-level synthesis approach based on the polytope model is presented for generating embedded processor arrays for matrix algorithms. The proposed approach provides a solution for efficient data memory accesses and data transfers for feeding the processor array, and it supports solving problems independently of their size, limited only by the available FPGA resources. The approach has been validated by generating processor arrays for three matrix algorithms used in digital signal processing applications: matrix-matrix multiplication, Cholesky decomposition and LU decomposition. These algorithms were targeted at a Spartan-6 device and compared against their sequential implementations targeted at a MicroBlaze processor, in order to give a general view of the gain achieved by the processor arrays when the arrays and the sequential processors are implemented in the same technology. Results show that the implemented arrays outperform hardware and software implementations in an embedded-platform scenario with a Spartan-6 device. (C) 2015 Elsevier B.V. All rights reserved.
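As a rough illustration of the kind of mapping the abstract alludes to, the sketch below emulates matrix-matrix multiplication on a two-dimensional processor array in plain Python. It is not the paper's generator: the function names, the choice of one processing element (PE) per output entry, and the schedule t = i + j + k are assumptions picked as a common textbook space-time mapping of the loop nest's iteration space.

```python
def matmul_reference(a, b):
    """Plain sequential triple loop, used as the baseline."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]


def matmul_processor_array(a, b):
    """Emulate an n x p array of PEs (hypothetical mapping, see lead-in).

    The iteration space is the set of points (i, j, k) with
    0 <= i < n, 0 <= j < p, 0 <= k < m. Each point is assigned to
    PE (i, j) and executed at time step t = i + j + k, so every PE
    performs at most one multiply-accumulate per step.
    Returns the product and the number of time steps used.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    last_step = (n - 1) + (p - 1) + (m - 1)
    for t in range(last_step + 1):
        # All PEs work in parallel within one time step; the two inner
        # loops only enumerate the PEs of the emulated array.
        for i in range(n):
            for j in range(p):
                k = t - i - j
                if 0 <= k < m:
                    c[i][j] += a[i][k] * b[k][j]
    return c, last_step + 1
```

For n x n operands this schedule needs 3n - 2 time steps on n^2 PEs, compared with n^3 multiply-accumulate operations executed one after another in the sequential loop. The emulation ignores memory access and data transfer costs, which the abstract names as a central concern of the actual approach.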
