Conference: 2010 IEEE International Conference on Cluster Computing

Acceleration of Streamed Tensor Contraction Expressions on GPGPU-Based Clusters



Abstract

Tensor contractions are generalized multidimensional matrix multiplication operations that occur widely in quantum chemistry. Efficient execution of tensor contractions on GPUs requires addressing several challenges, including index permutation and small dimension sizes that reduce thread-block utilization. In this paper, we present our approach to automatically generating CUDA code that executes tensor contractions on GPUs, including management of data movement between CPU and GPU. GPU-enabled code is generated for the most expensive contractions in CCSD(T), a key coupled cluster method, and incorporated into NWChem, a popular computational chemistry suite. We demonstrate a speedup of more than 8.4x using one core per node, and of more than 2.6x when utilizing the entire system with a hybrid CPU+GPU solution with 2 GPUs and 5 cores. Finally, we analyze the implementation behavior on future GPU systems.
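
To make the phrase "generalized multidimensional matrix multiplication" concrete, the following minimal Python sketch contracts a three-index tensor with a matrix in two ways: with explicit loops, and by permuting indices, flattening, and calling an ordinary matrix multiply. It is an illustration only, not the CUDA code generated in the paper; the function names, the index layout `A[d][a][b]`, and the chosen contraction `C[a][b][c] = sum_d A[d][a][b] * B[d][c]` are all assumptions made for this example.

```python
from itertools import product


def contract_direct(A, B, na, nb, nc, nd):
    """Reference result: C[a][b][c] = sum over d of A[d][a][b] * B[d][c]."""
    C = [[[0] * nc for _ in range(nb)] for _ in range(na)]
    for a, b, c, d in product(range(na), range(nb), range(nc), range(nd)):
        C[a][b][c] += A[d][a][b] * B[d][c]
    return C


def contract_via_matmul(A, B, na, nb, nc, nd):
    """Same contraction, expressed as permute + flatten + matrix multiply."""
    # Step 1 (index permutation): reorder A from [d][a][b] to [a][b][d] and
    # flatten the (a, b) pair into a single row index a * nb + b.
    A_mat = [[A[d][a][b] for d in range(nd)]
             for a in range(na) for b in range(nb)]
    # Step 2: ordinary matrix multiply, (na*nb x nd) times (nd x nc).
    C_mat = [[sum(row[d] * B[d][c] for d in range(nd)) for c in range(nc)]
             for row in A_mat]
    # Step 3: unflatten the row index back into separate a and b indices.
    return [[C_mat[a * nb + b] for b in range(nb)] for a in range(na)]


if __name__ == "__main__":
    na, nb, nc, nd = 2, 3, 4, 5
    # Small integer inputs (hypothetical data) so the comparison is exact.
    A = [[[d + 2 * a - b for b in range(nb)] for a in range(na)]
         for d in range(nd)]
    B = [[d * c + 1 for c in range(nc)] for d in range(nd)]
    assert contract_direct(A, B, na, nb, nc, nd) == \
        contract_via_matmul(A, B, na, nb, nc, nd)
```

The permutation in step 1 is the kind of data reordering the abstract lists as a challenge: on a GPU it costs memory traffic, and when extents such as `na` or `nb` are small the flattened matrices may be too small to keep thread blocks busy.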
