Linear processor array in DSP

Abstract

This article presents the design, implementation, and performance evaluation of a hardware accelerator for matrix multiplication. The accelerator is loosely coupled with the host computer via a common system bus. It is composed of a linear processor array (LPA), distributed memory, and a dedicated address generator unit (AGU). A mathematical procedure for LPA synthesis is given. The speedup of the proposed accelerator for matrix multiplication is O(n/2), where n is the number of processing elements (PEs) in the array, and the efficiency is 1/2. By using a hardware AGU, we achieved a speedup in data transfer of approximately 2.5 compared with a software implementation of address calculation, with a hardware overhead of less than 1%.
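
As a rough illustration of the figures quoted above, the following Python sketch models a linear array of PEs computing a matrix product and shows how a speedup of n/2 corresponds to an efficiency of 1/2. It is only a toy model: the column-interleaved data distribution and the names `lpa_matmul` and `efficiency` are assumptions made for this example, not the mapping or synthesis procedure derived in the article.

```python
def lpa_matmul(a, b, num_pes):
    """Toy model of a linear processor array computing C = A * B.

    Assumption (not from the article): PE p holds, in its local memory,
    the columns j of B and C for which j % num_pes == p.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for pe in range(num_pes):                # in hardware the PEs run concurrently
        for j in range(pe, cols, num_pes):   # columns owned by this PE
            for i in range(rows):
                acc = 0
                for k in range(inner):
                    acc += a[i][k] * b[k][j]  # one multiply-accumulate step
                c[i][j] = acc
    return c


def efficiency(speedup, num_pes):
    """Efficiency is the speedup divided by the number of PEs."""
    return speedup / num_pes


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(lpa_matmul(a, b, num_pes=2))           # [[19, 22], [43, 50]]
print(efficiency(speedup=8 / 2, num_pes=8))  # 0.5
```

With n = 8 PEs, a speedup of n/2 = 4 gives an efficiency of 4/8 = 1/2, which matches the value stated in the abstract.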