International Journal of Networking and Computing

High-Performance Symmetric Block Ciphers on Multicore CPU and GPUs

Abstract

As encrypting data for protection grows in importance, encryption using general-purpose computation on graphics processing units (GPGPU) has attracted attention as one way to realize high-speed data protection. GPUs have evolved in recent years into powerful parallel computing devices with a high cost-performance ratio. However, many factors affect GPU performance. In earlier work that explored various ways of raising AES performance with GPGPU, we obtained two technical findings: (1) 16 bytes per thread is the best granularity, and (2) the best memory allocation stores the extended key and substitution table in shared memory and the plaintext in registers. However, AES is not the only cipher widely used in the real world. This study therefore tests the hypothesis that these two findings also apply to implementations of other symmetric block ciphers on two generations of GPUs. We targeted five 128-bit symmetric block ciphers, AES, Camellia, CIPHERUNICORN-A, Hierocrypt-3, and SC2000, from the e-government recommended ciphers list of the CRYPTography Research and Evaluation Committees (CRYPTREC) in Japan. We evaluated the performance of these five ciphers on a machine with a 4-core CPU and each GPU using three methods: (A) throughput without data transfer, (B) throughput with data transfer overlapped with encryption processing on the GPU, and (C) throughput with data transfer not overlapped with encryption processing on the GPU. The results show that the SC2000 implementation reached a very high 73.4 Gbps under method (A) on a Tesla C2050. Under methods (B) and (C), throughput dropped to 33.4 Gbps and 18.3 Gbps, respectively. Method (B) was approximately 4.7 times faster than 8 threads on the 4-core CPU.
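
To put the reported figures in perspective, the following Python sketch works through some back-of-the-envelope arithmetic. It uses only the numbers quoted in the abstract (73.4, 33.4, and 18.3 Gbps, the 4.7x speedup, and 16 bytes per thread). The function names are hypothetical. The additive model in `implied_transfer_gbps` (non-overlapped time equals kernel time plus transfer time) is an assumption made here for illustration, not something the abstract states. The CPU throughput it prints is derived from the reported ratio, not a measured value.

```python
BLOCK_BYTES = 16  # one 128-bit cipher block per GPU thread (finding 1)

# Throughputs reported in the abstract for SC2000 on a Tesla C2050, in Gbps.
KERNEL_ONLY = 73.4     # method (A): no data transfer
OVERLAPPED = 33.4      # method (B): transfer overlapped with encryption
NON_OVERLAPPED = 18.3  # method (C): transfer and encryption not overlapped
GPU_OVER_CPU = 4.7     # reported speedup of method (B) over 8 CPU threads


def threads_needed(n_bytes: int) -> int:
    """Threads required when each thread encrypts one 16-byte block."""
    return -(-n_bytes // BLOCK_BYTES)  # ceiling division


def blocks_per_second(gbps: float) -> float:
    """Convert a throughput in Gbps into 128-bit blocks per second."""
    return gbps * 1e9 / (BLOCK_BYTES * 8)


def implied_transfer_gbps(kernel: float, serial: float) -> float:
    """Transfer-only rate implied by an additive timing model.

    ASSUMPTION (not from the abstract): per-bit time in the serial case
    is the sum of per-bit kernel time and per-bit transfer time.
    """
    return 1.0 / (1.0 / serial - 1.0 / kernel)


if __name__ == "__main__":
    print("threads for 1 MiB of plaintext:", threads_needed(1 << 20))
    print("blocks/s at method (A):", blocks_per_second(KERNEL_ONLY))
    print("method (B) as a fraction of (A):", OVERLAPPED / KERNEL_ONLY)
    print("method (C) as a fraction of (A):", NON_OVERLAPPED / KERNEL_ONLY)
    print("derived CPU throughput (Gbps):", OVERLAPPED / GPU_OVER_CPU)
    print("implied transfer rate (Gbps):",
          implied_transfer_gbps(KERNEL_ONLY, NON_OVERLAPPED))
```

Under these assumptions, data transfer rather than the cipher kernel appears to dominate end-to-end time, which is consistent with the drop from method (A) to methods (B) and (C) described above.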
