Journal: Journal of Supercomputing

Accelerating number theoretic transform in GPU platform for fully homomorphic encryption

Abstract

Many applications in scientific computing and cryptography involve large integer multiplication, which is a time-consuming operation. To reduce its computational complexity, the number theoretic transform is widely used, since it allows the multiplication to be performed in the frequency domain. However, large integer multiplication is still too slow when the operands are very large (e.g., more than 100K bits). Several researchers have therefore proposed accelerating the number theoretic transform on massively parallel GPU architectures. In this paper, we propose several techniques to improve the performance of number theoretic transform implementations, yielding an implementation that is faster than the state-of-the-art work by Dai et al. The proposed techniques include register-based storage of twiddle factors and multi-stream asynchronous computation, which leverage features offered by newer GPU architectures. The proposed number theoretic transform implementation was applied to the CMNT fully homomorphic encryption scheme proposed by Coron et al. With the proposed implementation technique, homomorphic multiplications in CMNT take 0.27 ms on a GTX1070 desktop GPU and 7.49 ms on a Jetson TX1 embedded system, respectively. This shows that the proposed implementation is suitable for practical applications in server environments as well as in embedded systems.
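
To make the frequency-domain idea in the abstract concrete, the following is a minimal CPU-side sketch of large integer multiplication via a number theoretic transform (NTT). It is not the paper's GPU implementation: the modulus 998244353, the primitive root 3, the 8-bit limb size, and the names `ntt`, `to_limbs`, and `multiply` are all assumptions chosen for illustration. None of the paper's optimizations (register-based twiddle factors, multiple streams) are shown.

```python
# Illustrative sketch only. MOD, ROOT, the limb base, and all function
# names are assumptions for this example and are not taken from the paper.
MOD = 998244353   # NTT-friendly prime: 119 * 2**23 + 1
ROOT = 3          # a primitive root modulo MOD


def ntt(a, invert=False):
    """Iterative radix-2 NTT, in place; len(a) must be a power of two."""
    n = len(a)
    # Bit-reversal permutation so butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        # Twiddle factor: a primitive `length`-th root of unity mod MOD.
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)  # modular inverse (Fermat)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % MOD
                a[k] = (u + v) % MOD
                a[k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        for i in range(n):
            a[i] = a[i] * n_inv % MOD
    return a


def to_limbs(x, base):
    """Split a non-negative integer into little-endian base-`base` limbs."""
    limbs = []
    while x:
        limbs.append(x % base)
        x //= base
    return limbs or [0]


def multiply(x, y, base=256):
    """Multiply two non-negative integers by pointwise product in the NTT domain."""
    a, b = to_limbs(x, base), to_limbs(y, base)
    n = 1
    while n < len(a) + len(b):
        n <<= 1
    fa = ntt(a + [0] * (n - len(a)))
    fb = ntt(b + [0] * (n - len(b)))
    fc = [p * q % MOD for p, q in zip(fa, fb)]
    c = ntt(fc, invert=True)
    # Python integers are unbounded, so carries are absorbed by this sum.
    return sum(coef * base**i for i, coef in enumerate(c))
```

With 8-bit limbs and this modulus, each convolution coefficient stays below the modulus only while the operands have fewer than roughly 15,000 limbs; larger operands would need a larger modulus or several moduli combined by the Chinese remainder theorem. For example, `multiply(3**500, 7**400)` should agree with Python's built-in `3**500 * 7**400`.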