GPU Accelerated Tensor Computation of Hadamard Product for Machine Learning Applications

Abstract

Computation on Graphics Processing Units (GPUs) has emerged as a cost-effective parallel computing paradigm for high-performance computing that makes it possible to process large-scale data in parallel. GPUs are designed to perform complex mathematical and geometric tasks, primarily for 3D graphics. They can also be used for non-graphics, general-purpose computation, known as General-Purpose Computing on GPUs (GPGPU), a sub-discipline of High-Performance Computing (HPC). Using a GPU together with a CPU to accelerate complex scientific, engineering, and mathematical tasks is known as GPU-accelerated computing. In this paper, we propose an efficient tensor computation for the Hadamard Product (HP), which is applied directly in machine learning, especially in Long Short-Term Memory (LSTM) networks. The HP computation becomes expensive when higher-order tensors with millions of elements are considered, so traditional serial, CPU-only computation becomes slow and inefficient. The contribution of this paper is twofold. First, we develop efficient algorithms for higher-order tensors based on dimension conversion. Second, we apply these algorithms on the GPU to speed up the computation; to do so, we develop an efficient partitioning scheme for higher-order tensors. We implemented the algorithms using the CUDA (Compute Unified Device Architecture) C programming model developed by NVIDIA. We compared them with a Traditional Multidimensional Array (TMA) based algorithm and found improved results.
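
The abstract does not spell out the dimension-conversion or partitioning algorithms, so the following is only a minimal CPU-side sketch in plain C of the general idea, not the authors' method. It assumes that dimension conversion means row-major flattening of a higher-order tensor into a 1-D array, and that partitioning means splitting that array into fixed-size contiguous chunks. The names `flat_index`, `tensor_size`, and `hadamard_partitioned` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major linear offset of a multi-index (assumed conversion scheme).
   dims[k] is the extent of dimension k; idx[k] is the coordinate in it. */
size_t flat_index(const size_t *dims, const size_t *idx, size_t order)
{
    size_t offset = 0;
    for (size_t k = 0; k < order; ++k) {
        assert(idx[k] < dims[k]);
        offset = offset * dims[k] + idx[k];
    }
    return offset;
}

/* Total number of elements in a tensor with the given extents. */
size_t tensor_size(const size_t *dims, size_t order)
{
    size_t n = 1;
    for (size_t k = 0; k < order; ++k)
        n *= dims[k];
    return n;
}

/* Element-wise (Hadamard) product c[i] = a[i] * b[i] over flattened storage,
   processed in partitions of `chunk` elements. The partitions are independent,
   so on a GPU each one could be handed to a separate thread block; that
   mapping is an assumption of this sketch, not taken from the paper. */
void hadamard_partitioned(const float *a, const float *b, float *c,
                          size_t n, size_t chunk)
{
    assert(chunk > 0);
    for (size_t start = 0; start < n; start += chunk) {
        /* The last partition may be shorter than `chunk`. */
        size_t end = (n - start < chunk) ? n : start + chunk;
        for (size_t i = start; i < end; ++i)
            c[i] = a[i] * b[i];
    }
}
```

For a 2 x 3 x 4 tensor, `flat_index` maps element (1, 2, 3) to offset 23, and a chunk size of 5 splits the 24 elements into five partitions, the last holding 4 elements. Because the product is element-wise, the result does not depend on the chunk size; only the work distribution changes.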

Bibliographic details

Source
International Conference on Information and Communication Technology for Sustainable Development, 2021, pp. 1-5 (5 pages)
Authors
K. M. Azharul Hasan; Sagar Chakraborty
Original format
PDF
Keywords
Machine learning algorithms; Tensors; Graphics processing units; Machine learning; Partitioning algorithms; Acceleration; Task analysis
