
A Comparison of FPGA Implementations of Bit-Level and Word-Level Matrix Multipliers

Abstract

We have implemented a novel bit-level matrix multiplier on a Xilinx FPGA chip in which each processing element performs a simple operation: adding three to six bits to generate one partial-sum bit and one to two carry-out bits. The speedup over a word-level design is possible because the individual bits of a word do not have to be processed as a unit in a bit-level architecture. Previous work has shown that bit-level architectures for fixed-point applications can be O(log p) times faster than the corresponding word-level architecture, where p is the word length. In this paper, we compare the bit-level matrix multiplier, implemented on a Xilinx FPGA chip, with a word-level matrix multiplier composed of highly optimized multiplier and adder macros available in the Xilinx Core Generator library. The architecture presented here is faster than previous ones because it breaks the critical path in the dependence graph in half. Our results show that a speedup by a factor of 2 can be obtained in practice.
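To make the bit-level versus word-level distinction concrete, the following is a minimal software sketch, not the architecture from the paper. It assumes unsigned p-bit operands and uses hypothetical helper names (`bits`, `matmul_word`, `matmul_bit`). The bit-level version never multiplies whole words: it ANDs individual operand bits, counts the resulting partial-product bits per binary weight, and then resolves carries. It does not model the paper's processing elements, their three-to-six-bit adders, or any timing behavior.

```python
def bits(x, p):
    """Return the p low-order bits of a non-negative integer, LSB first."""
    return [(x >> i) & 1 for i in range(p)]


def matmul_word(A, B):
    """Word-level reference: each multiply-add treats a word as one unit."""
    n, m, q = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(q)]
            for i in range(n)]


def matmul_bit(A, B, p):
    """Bit-level model for unsigned p-bit entries (an illustrative assumption)."""
    n, m, q = len(A), len(B), len(B[0])
    C = [[0] * q for _ in range(n)]
    for i in range(n):
        for j in range(q):
            # cols[w] counts the 1-valued partial-product bits of weight 2**w.
            cols = [0] * (2 * p - 1)
            for k in range(m):
                a = bits(A[i][k], p)
                b = bits(B[k][j], p)
                for s in range(p):
                    for t in range(p):
                        cols[s + t] += a[s] & b[t]
            # Resolve carries column by column: keep one sum bit per weight
            # and pass the remainder to the next column.
            total, carry, w = 0, 0, 0
            while w < len(cols) or carry:
                v = carry + (cols[w] if w < len(cols) else 0)
                total |= (v & 1) << w
                carry = v >> 1
                w += 1
            C[i][j] = total
    return C


A = [[3, 5], [7, 2]]
B = [[1, 4], [6, 2]]
assert matmul_bit(A, B, p=3) == matmul_word(A, B)
```

In this model the per-bit AND operations and per-column counts are independent of one another, which is the property a bit-level hardware array can exploit; the sequential loops here are only a convenience of the software sketch.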
