Journal: IEEE Transactions on Computers

A Parallel Implementation of Montgomery Multiplication on Multicore Systems: Algorithm, Analysis, and Prototype



Abstract

Montgomery multiplication is one of the cornerstones of public-key cryptography, with important applications in the RSA algorithm, Elliptic-Curve Cryptography, and the Digital Signature Standard. Efficient implementation of this long-word-length modular multiplication is crucial to the performance of public-key cryptography. In step with the strong momentum of the shift from single-core to multicore systems, we present a parallel software implementation of Montgomery multiplication for multicore systems. Our comprehensive analysis shows that the proposed scheme, pSHS, partitions the task in a balanced way so that each core has the same amount of work. We also analyze the impact of intercore communication overhead on the performance of pSHS. The analysis reveals that pSHS achieves high performance, scales across different numbers of cores, and remains stable when the communication latency changes. The analysis also shows how to set the parameters to achieve optimal performance. We implemented pSHS on a prototype multicore architecture configured in a Field Programmable Gate Array (FPGA). Compared with the sequential implementation, pSHS accelerates 2,048-bit Montgomery multiplication by 1.97, 3.68, and 6.13 times on two-core, four-core, and eight-core architectures, respectively, with a communication latency of 100 clock cycles.
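For readers unfamiliar with the operation the abstract refers to, the following is a minimal sequential sketch of textbook Montgomery multiplication in Python. It is not the pSHS scheme and not the authors' code; it only illustrates the modular product that pSHS parallelizes. The function names (`montgomery_setup`, `mont_mul`, `mod_mul`) and the 32-bit word size are assumptions chosen for illustration, and the sketch relies on Python's arbitrary-precision integers rather than the word-by-word scanning a real implementation would use.

```python
def montgomery_setup(n: int, word_bits: int = 32):
    """Precompute Montgomery constants for an odd modulus n.

    Hypothetical helper: R = 2**r_bits is the smallest power of
    2**word_bits that exceeds n (word_bits=32 is an assumption).
    """
    if n <= 1 or n % 2 == 0:
        raise ValueError("Montgomery multiplication needs an odd modulus > 1")
    words = -(-n.bit_length() // word_bits)  # ceiling division
    r_bits = words * word_bits
    r = 1 << r_bits
    # n_prime satisfies n * n_prime == -1 (mod R); pow(n, -1, r) needs Python 3.8+.
    n_prime = (-pow(n, -1, r)) % r
    return r_bits, n_prime


def mont_mul(a: int, b: int, n: int, r_bits: int, n_prime: int) -> int:
    """Return a * b * R^-1 mod n for 0 <= a, b < n, using no division by n."""
    mask = (1 << r_bits) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> r_bits          # exact division by R; result is below 2n
    return u - n if u >= n else u      # single conditional subtraction


def mod_mul(a: int, b: int, n: int, word_bits: int = 32) -> int:
    """Compute a * b mod n by converting to and from Montgomery form."""
    r_bits, n_prime = montgomery_setup(n, word_bits)
    a_m = ((a % n) << r_bits) % n                # a * R mod n
    b_m = ((b % n) << r_bits) % n                # b * R mod n
    c_m = mont_mul(a_m, b_m, n, r_bits, n_prime)  # a * b * R mod n
    return mont_mul(c_m, 1, n, r_bits, n_prime)   # strip the factor R
```

In practice the conversion to and from Montgomery form is paid once per exponentiation rather than once per product, which is why the cost of `mont_mul` itself dominates in RSA-style workloads.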
