Conference: International Conference on Information and Communication Technology for Sustainable Development

GPU Accelerated Tensor Computation of Hadamard Product for Machine Learning Applications

Abstract

Computation on Graphics Processing Units (GPUs) has emerged as a cost-effective parallel computing paradigm for high-performance computing, making it possible to process large-scale data in parallel. GPUs are designed to perform complex mathematical and geometric tasks, primarily for 3D graphics. A GPU can also be used for non-graphics, general-purpose computation, known as General-Purpose Computing on GPU (GPGPU), a sub-discipline of High-Performance Computing (HPC). Using a GPU together with a CPU to accelerate complex scientific, engineering, and mathematical tasks is known as GPU-accelerated computing. In this paper, we propose an efficient tensor computation for the Hadamard Product (HP), which is used directly in machine learning applications, especially Long Short-Term Memory (LSTM) networks. HP computation becomes costly when higher-order tensors with millions of elements are considered, so a traditional CPU-only serial implementation becomes slow and inefficient. The contribution of this paper is twofold. First, we develop efficient algorithms for higher-order tensors based on dimension conversion. Second, we apply the algorithms on the GPU to speed up the computation; to do so, we develop an efficient partitioning scheme for higher-order tensors. We implement the algorithms using the CUDA (Compute Unified Device Architecture) C programming model developed by NVIDIA. Compared with a Traditional Multidimensional Array (TMA) based algorithm, the proposed algorithms give improved results.
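
The abstract does not spell out the algorithms, so the following is only a minimal CPU-side sketch of the general idea, not the paper's method. It assumes that "dimension conversion" can be illustrated by row-major linearization of a higher-order tensor into a flat buffer, and that the partitioning scheme can be approximated by fixed-size contiguous chunks. The names `linear_index`, `partition`, and `hadamard_flat` are hypothetical and chosen for illustration; the paper's own implementation is in CUDA C.

```python
# Illustrative sketch only; the abstract does not give the paper's algorithm.
# Assumptions: row-major linearization, fixed-size contiguous chunks.


def linear_index(index, shape):
    """Map a multi-index to its row-major offset in a flat buffer."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError("index out of range for shape")
        offset = offset * n + i
    return offset


def partition(total, chunk_size):
    """Split range(total) into contiguous (start, stop) chunks."""
    return [(s, min(s + chunk_size, total)) for s in range(0, total, chunk_size)]


def hadamard_flat(a, b, chunk_size=4):
    """Element-wise product of two equally sized flat buffers, chunk by chunk."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of elements")
    out = [0] * len(a)
    for start, stop in partition(len(a), chunk_size):
        # On a GPU, each chunk could map to one thread block and each k to
        # one thread; here the loops simply run serially on the CPU.
        for k in range(start, stop):
            out[k] = a[k] * b[k]
    return out


# A third-order tensor of shape 2 x 3 x 4 stored as a flat list of 24 values.
shape = (2, 3, 4)
size = 2 * 3 * 4
a = list(range(size))
b = [2] * size
c = hadamard_flat(a, b, chunk_size=5)
```

Because the Hadamard product is purely element-wise, the order of the tensor does not matter once both operands share the same flat layout: element `(i, j, k)` of the result is found at `c[linear_index((i, j, k), shape)]`. This is why a flat, chunked formulation lends itself to parallel execution.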
