Journal: IEEE Transactions on Circuits and Systems I: Regular Papers

A High-Speed, Energy-Efficient Two-Cycle Multiply-Accumulate (MAC) Architecture and Its Application to a Double-Throughput MAC Unit

Abstract

We propose a high-speed and energy-efficient two-cycle multiply-accumulate (MAC) architecture that supports two's complement numbers, and includes accumulation guard bits and saturation circuitry. The first MAC pipeline stage contains only partial-product generation circuitry and a reduction tree, while the second stage, thanks to a special sign-extension solution, implements all other functionality. Place-and-route evaluations using a 65-nm 1.1-V cell library show that the proposed architecture offers a 31% improvement in speed and a 32% reduction in energy per operation, averaged across operand sizes of 16, 32, 48, and 64 bits, over a reference two-cycle MAC architecture that employs a multiplier in the first stage and an accumulator in the second. When operating the proposed architecture at the lower frequency of the reference architecture, the available timing slack can be used to downsize gates, resulting in a 52% reduction in energy compared to the reference. We extend the new architecture to create a versatile double-throughput MAC (DTMAC) unit that efficiently performs either multiply-accumulate or multiply operations for $N$-bit, $1\times N/2$-bit, or $2\times N/2$-bit operands. In comparison to a fixed-function 32-bit MAC unit, 16-bit multiply-accumulate operations can be executed with 67% higher energy efficiency on a 32-bit DTMAC unit.
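
To make the arithmetic behaviour named in the abstract concrete, the following is a minimal behavioural sketch in Python. It is not the paper's circuit: it models only the functional result of a two's complement multiply-accumulate with guard bits and saturation, and of a $2\times N/2$-bit mode. The accumulator width of 2n + guard_bits, the default of 8 guard bits, the packing of two half-width operands into the high and low halves of a word, and all function names (to_signed, saturate, mac, dual_mac) are assumptions made for illustration.

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value: int, bits: int) -> int:
    """Clamp `value` to the range of a `bits`-wide two's complement number."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def mac(acc: int, a: int, b: int, n: int = 16, guard_bits: int = 8) -> int:
    """One multiply-accumulate step: acc + a * b.

    Assumption: the accumulator is 2n + guard_bits wide and saturates
    instead of wrapping around when that range is exceeded.
    """
    product = to_signed(a, n) * to_signed(b, n)
    return saturate(acc + product, 2 * n + guard_bits)


def dual_mac(acc_hi: int, acc_lo: int, a: int, b: int,
             n: int = 32, guard_bits: int = 8) -> tuple:
    """Two independent n/2-bit MAC steps on operands packed into n-bit words.

    Assumption: each n-bit word carries one operand in its high half and
    one in its low half; the packing layout is hypothetical.
    """
    half = n // 2
    # to_signed masks to `half` bits, so the low half needs no explicit mask.
    hi = mac(acc_hi, a >> half, b >> half, half, guard_bits)
    lo = mac(acc_lo, a, b, half, guard_bits)
    return hi, lo
```

In this model the guard bits are what allow a running sum to exceed the 2n-bit product range without clipping: accumulating the largest 16-bit product (-32768 times -32768) twice gives 2^31, which does not fit in 32 signed bits but fits in the assumed 40-bit accumulator.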