Exploiting potential of deep neural networks by layer-wise fine-grained parallelism

Jiang Wenbin; Zhang Yangsong; Liu Pai; Peng Jing; Yang Laurence T.; Ye Geyan; Jin Hai

Abstract

Deep neural networks (DNNs) have become increasingly important for big data analysis. They usually rely on data parallelism or model parallelism for extreme-scale computing. However, both approaches improve performance mainly through coarse-grained parallelization schemes, and neither fully exploits the parallelism that many-core systems (such as GPUs) offer for neural network models. Here, a new fine-grained parallelism strategy, named FiLayer, is presented based on layer-wise parallelization. It has two components: inter-layer parallelism and intra-layer parallelism. With inter-layer parallelism, several neighboring layers in a network model are processed in a pipelined manner. With intra-layer parallelism, the operations in one layer are split into several parts and processed concurrently. Both methods are implemented with CUDA streams. A mathematical analysis is given of how the number of fragments affects the performance of inter-layer parallelism, and of how the number of CUDA streams affects the performance of intra-layer parallelism. The proposed approach is implemented on top of Caffe. Experiments use representative datasets, including CIFAR100 and ImageNet. The evaluation results show that FiLayer helps Caffe achieve remarkable speedups, which is valuable for big data analysis. (C) 2019 Elsevier B.V. All rights reserved.
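The abstract's point that the fragment number influences inter-layer performance can be illustrated with a toy timing model. This is a minimal sketch under stated assumptions and is not the analysis from the paper: it assumes every layer takes the same time, that fragments overlap perfectly across layers, and that each fragment step pays a fixed overhead (a stand-in for launch and synchronization cost). The function names and all parameter values are hypothetical.

```python
def sequential_time(num_layers, layer_time):
    """Time to push one whole batch through the layers one after another."""
    return num_layers * layer_time


def pipelined_time(num_layers, layer_time, num_fragments, overhead):
    """Idealized pipeline (an assumption, not the paper's model).

    The batch is split into equal fragments, and each layer starts on a
    fragment as soon as the previous layer has finished it. Every fragment
    step costs layer_time / num_fragments plus a fixed overhead.
    """
    step = layer_time / num_fragments + overhead
    # Classic pipeline fill-and-drain count: fragments + stages - 1 steps.
    return (num_fragments + num_layers - 1) * step


if __name__ == "__main__":
    layers, layer_time, overhead = 4, 8.0, 0.5  # hypothetical values
    base = sequential_time(layers, layer_time)
    for n in (1, 2, 4, 8, 16, 32):
        speedup = base / pipelined_time(layers, layer_time, n, overhead)
        print(f"fragments={n:2d}  speedup={speedup:.2f}")
```

In this toy model the speedup first rises with the fragment number and then falls once the per-fragment overhead dominates, which is one plausible reason an analysis of the fragment number is needed.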

Bibliographic details

Source
Future Generation Computer Systems | 2020, Issue 1 | pp. 210-221 | 12 pages
Authors
Jiang Wenbin; Zhang Yangsong; Liu Pai; Peng Jing; Yang Laurence T.; Ye Geyan; Jin Hai
Author affiliations

Huazhong Univ Sci & Technol Natl Engn Res Ctr Big Data Technol & Syst Sch Comp Sci & Technol Serv Comp Technol & Syst Lab Cluster & Grid Comp Wuhan 430074 Hubei Peoples R China;

Huazhong Univ Sci & Technol Sch Comp Sci & Technol Cyber Phys Social Syst Lab Wuhan 430074 Hubei Peoples R China|St Francis Xavier Univ Dept Comp Sci Antigonish NS Canada;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Deep learning; Fine-grained parallelism; CUDA stream;
