...
Journal: International Journal of Intelligent Systems and Applications

Accelerating Training of Deep Neural Networks on GPU using CUDA



Abstract

The development of fast and efficient training algorithms for Deep Neural Networks has been a subject of interest in recent years, because the biggest drawback of Deep Neural Networks is their enormous computational cost and the long time required to train their parameters. This has motivated several researchers to focus on recent advances in hardware architectures and in parallel programming models and paradigms for accelerating the training of Deep Neural Networks. We revisited the concepts and mechanisms of typical Deep Neural Network training algorithms, such as the Backpropagation Algorithm and the Boltzmann Machine Algorithm, and observed that matrix multiplication constitutes the major portion of the workload in the training process, because it is carried out a huge number of times during training. With the advent of many-core GPU technologies, matrix multiplication can be performed very efficiently in parallel, so training a Deep Neural Network no longer consumes as much time as it did a few years ago. CUDA is one of the high-performance parallel programming models for exploiting the capabilities of modern many-core GPU systems. In this paper, we propose to modify the Backpropagation Algorithm and the Boltzmann Machine Algorithm with CUDA parallel matrix multiplication and to test them on a many-core GPU system. Finally, we find that the proposed strategies achieve much faster training of Deep Neural Networks than the classic strategies.
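The abstract describes replacing the matrix multiplications inside Backpropagation and Boltzmann Machine training with CUDA parallel kernels but gives no code. Below is a minimal, illustrative sketch of a naive CUDA matrix-multiplication kernel of the kind such a modification would rely on; the matrix names (A, B, C), dimensions, and launch configuration are assumptions for illustration, not the authors' implementation.

```cuda
// Minimal sketch (not the paper's code): naive CUDA matrix multiplication
// C = A * B, with A of size M x K, B of size K x N, C of size M x N.
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>

__global__ void matmul(const float *A, const float *B, float *C,
                       int M, int N, int K) {
    // Each thread computes one output element C[row][col].
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < M && col < N) {
        float acc = 0.0f;
        for (int k = 0; k < K; ++k)
            acc += A[row * K + k] * B[k * N + col];
        C[row * N + col] = acc;
    }
}

int main() {
    const int M = 256, N = 256, K = 256;            // placeholder layer sizes
    size_t bytesA = M * K * sizeof(float);
    size_t bytesB = K * N * sizeof(float);
    size_t bytesC = M * N * sizeof(float);

    float *hA = (float *)malloc(bytesA);
    float *hB = (float *)malloc(bytesB);
    float *hC = (float *)malloc(bytesC);
    for (int i = 0; i < M * K; ++i) hA[i] = 1.0f;   // dummy activations
    for (int i = 0; i < K * N; ++i) hB[i] = 0.5f;   // dummy weights

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytesA);
    cudaMalloc(&dB, bytesB);
    cudaMalloc(&dC, bytesC);
    cudaMemcpy(dA, hA, bytesA, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB, bytesB, cudaMemcpyHostToDevice);

    // One thread per output element, 16x16 threads per block.
    dim3 block(16, 16);
    dim3 grid((N + block.x - 1) / block.x, (M + block.y - 1) / block.y);
    matmul<<<grid, block>>>(dA, dB, dC, M, N, K);
    cudaDeviceSynchronize();

    cudaMemcpy(hC, dC, bytesC, cudaMemcpyDeviceToHost);
    printf("C[0][0] = %f\n", hC[0]);                // expect 256 * 1.0 * 0.5 = 128

    cudaFree(dA); cudaFree(dB); cudaFree(dC);
    free(hA); free(hB); free(hC);
    return 0;
}
```

In this naive version each thread computes one output element independently; in practice a tiled shared-memory kernel or a library routine such as cublasSgemm would typically be preferred for the layer-sized matrices that appear in backpropagation.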


