首页> 外文期刊>IEEE Transactions on Computers >Area-Time Efficient Architecture of FFT-Based Montgomery Multiplication
【24h】

Area-Time Efficient Architecture of FFT-Based Montgomery Multiplication

机译:基于FFT的蒙哥马利乘法的时域高效架构

获取原文
获取原文并翻译 | 示例

摘要

The modular multiplication operation is the most time-consuming operation for number-theoretic cryptographic algorithms involving large integers, such as RSA and Diffie-Hellman. Implementations reveal that more than 75 percent of the time is spent in the modular multiplication function within the RSA for more than 1,024-bit moduli. There are fast multiplier architectures to minimize the delay and increase the throughput using parallelism and pipelining. However such designs are large in terms of area and low in efficiency. In this paper, we integrate the fast Fourier transform (FFT) method into the McLaughlin’s framework, and present an improved FFT-based Montgomery modular multiplication (MMM) algorithm achieving high area-time efficiency. Compared to the previous FFT-based designs, we inhibit the zero-padding operation by computing the modular multiplication steps directly using cyclic and nega-cyclic convolutions. Thus, we reduce the convolution length by half. Furthermore, supported by the number-theoretic weighted transform, the FFT algorithm is used to provide fast convolution computation. We also introduce a general method for efficient parameter selection for the proposed algorithm. Architectures with single and double butterfly structures are designed obtaining low area-latency solutions, which we implemented on Xilinx Virtex-6 FPGAs. The results show that our work offers a better area-latency efficiency compared to the state-of-the-art FFT-based MMM architectures from and above 1,024-bit operand sizes. We have obtained area-latency efficiency improvements up to 50.9 percent for 1,024-bit, 41.9 percent for 2,048-bit, 37.8 percent for 4,096-bit and 103.2 percent for 7,680-bit operands. Furthermore, the operating latency is also outperformed with high clock frequency for length-64 transform and above.
机译:对于涉及大整数的数论密码算法(例如RSA和Diffie-Hellman),模块化乘法运算是最耗时的运算。实施显示,RSA的模块化乘法功能花费了超过75%的时间以实现1,024位以上的模数。有快速乘法器体系结构可使用并行性和流水线技术来最大程度地减少延迟并提高吞吐量。但是,这种设计的面积大,效率低。在本文中,我们将快速傅立叶变换(FFT)方法集成到McLaughlin的框架中,并提出了一种改进的基于FFT的蒙哥马利模块化乘法(MMM)算法,该算法可实现较高的时空效率。与以前的基于FFT的设计相比,我们通过直接使用循环和负循环卷积计算模块化乘法步骤来抑制零填充操作。因此,我们将卷积长度减少了一半。此外,在数论加权变换的支持下,FFT算法用于提供快速卷积计算。我们还为提出的算法介绍了一种有效的参数选择方法。设计具有单蝶形和双蝶形结构的架构以获得低面积延迟解决方案,我们在Xilinx Virtex-6 FPGA上实现了该方案。结果表明,与1,024位操作数大小及以上的基于FFT的最新MMM架构相比,我们的工作提供了更好的区域延迟效率。我们已将1,024位的区域延迟效率提高了50.9%,将2,048位的区域延迟效率提高了41.9%,将4,096位的区域延迟提高了37.8%,将7,680位的操作数提高了103.2%。此外,对于长度为64或更高的转换,其运行等待时间的时钟频率也较高。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号