Journal: Parallel Processing Letters

HIGH PRECISION INTEGER ADDITION, SUBTRACTION AND MULTIPLICATION WITH A GRAPHICS PROCESSING UNIT

Abstract

In this paper we evaluate the potential for using an NVIDIA graphics processing unit (GPU) to accelerate high precision integer multiplication, addition, and subtraction. The reported peak vector performance for a typical GPU appears to offer good potential for accelerating such a computation. Because of limitations in the on-chip memory, the high cost of kernel launches, and the nature of the architecture’s support for parallelism, we used a hybrid algorithmic approach to obtain good performance on multiplication. On the GPU itself we adapt the Strassen FFT algorithm to multiply 32KB chunks, while on the CPU we adapt the Karatsuba divide-and-conquer approach to optimize application of the GPU’s partial multiplies, which are viewed as “digits” by our implementation of Karatsuba. Even with this approach, the result is at best a factor of three increase in performance, compared with using the GMP package on a 64-bit CPU at a comparable technology node. Our implementations of addition and subtraction achieve up to a factor of eight improvement. We identify the issues that limit performance and discuss the likely impact of planned advances in GPU architecture.
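The hybrid structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `karatsuba`, `chunk_multiply`, and `CHUNK_BITS` are hypothetical, and `chunk_multiply` merely stands in for the GPU's FFT-based multiply of two 32KB chunks by using Python's built-in integer product. The sketch only shows how a CPU-side Karatsuba recursion might treat each chunk-sized product as a single "digit" multiply.

```python
CHUNK_BITS = 32 * 1024 * 8  # one 32 KB "digit", the chunk size named in the abstract


def chunk_multiply(a: int, b: int) -> int:
    """Hypothetical stand-in for the GPU partial multiply of two chunks.

    The abstract describes an FFT-based multiply on the GPU at this point;
    this sketch simply uses Python's built-in integer product instead.
    """
    return a * b


def karatsuba(a: int, b: int, chunk_bits: int = CHUNK_BITS) -> int:
    """Multiply non-negative integers, treating chunk_bits-wide pieces as digits."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    n = max(a.bit_length(), b.bit_length())
    if n <= chunk_bits:
        return chunk_multiply(a, b)  # both operands fit in one "digit"

    # Split at a whole number of chunks so each half stays digit-aligned.
    chunks = (n + chunk_bits - 1) // chunk_bits
    half = (chunks // 2) * chunk_bits
    mask = (1 << half) - 1
    a_hi, a_lo = a >> half, a & mask
    b_hi, b_lo = b >> half, b & mask

    # Three recursive products instead of the four a schoolbook split needs.
    # The sums below may be one bit wider than a half; the recursion absorbs that.
    low = karatsuba(a_lo, b_lo, chunk_bits)
    high = karatsuba(a_hi, b_hi, chunk_bits)
    mid = karatsuba(a_lo + a_hi, b_lo + b_hi, chunk_bits) - low - high

    return (high << (2 * half)) + (mid << half) + low
```

Passing a small `chunk_bits` value (for example 64) forces the recursion to run on modest inputs, which makes the splitting easy to inspect. With the default chunk size, any operands under 32KB go straight to `chunk_multiply`.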