International Conference on Computer Supported Cooperative Work in Design

PipePar: A Pipelined Hybrid Parallel Approach for Accelerating Distributed DNN Training


Abstract

Large-scale DNN training tasks are exceedingly compute-intensive and time-consuming, and are usually executed on highly parallel platforms. Data and model parallelism are common ways to speed up training across devices. However, they tend to achieve sub-optimal performance due to communication overheads and unbalanced load among servers. Recently emerging pipelining solutions mitigate these issues by combining the advantages of data and model parallelism. In this paper, we take a step further towards optimizing the execution of pipelining. We introduce PipePar, a pipeline-parallel DNN training method that provides optimized execution strategies for layer-stacked DNNs. PipePar considers the entire tensor-partition space of pipelining and explores potential hybrid parallel configurations for each stage in the pipeline. Additionally, we account for the network heterogeneity between different GPU servers, where tensors are inevitably transferred over links with different bandwidths and latencies. Taking both the computation and communication capacity of different GPU servers into account, PipePar finds an elastic load-distribution strategy at different levels. We evaluate PipePar with a set of real-world DNNs on 4 GPU servers. Our experimental results show that PipePar is able to find an efficient strategy that is up to $2.16\times$ faster than state-of-the-art hybrid parallelization approaches.
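The abstract frames PipePar's search as choosing, for each pipeline stage, a contiguous slice of layers and a parallel configuration while accounting for each server's compute speed and its link bandwidth. As a rough illustration of that kind of search space (not the paper's actual algorithm), the sketch below uses a toy dynamic program to split a layer-stacked model into contiguous stages across heterogeneous servers so that the slowest stage, compute plus boundary-activation transfer, is minimized; all costs, names, and server parameters here are hypothetical.

```python
"""
Illustrative sketch only: split a layer-stacked DNN into contiguous pipeline
stages over heterogeneous GPU servers, minimizing the bottleneck stage.
The cost model and all inputs are made-up simplifications, not PipePar itself.
"""
from functools import lru_cache

# Hypothetical per-layer compute cost (normalized) and activation size (MB).
layer_flops = [4.0, 6.0, 6.0, 8.0, 3.0, 2.0]
act_size_mb = [16.0, 16.0, 8.0, 8.0, 4.0, 2.0]

# Hypothetical heterogeneous servers: relative compute speed, and the bandwidth
# (MB/s, scaled) of the link to the *next* server in the pipeline.
server_speed = [1.0, 1.5, 0.8, 1.2]
link_bw_mb_s = [100.0, 50.0, 200.0]   # one outgoing link per non-final server

def stage_time(first, last, server):
    """Compute time of layers [first, last] on `server`, plus the time to ship
    the boundary activation to the next server (zero for the final server)."""
    compute = sum(layer_flops[first:last + 1]) / server_speed[server]
    comm = 0.0
    if server < len(server_speed) - 1:
        comm = act_size_mb[last] / link_bw_mb_s[server]
    return compute + comm

@lru_cache(maxsize=None)
def best_bottleneck(layer, server):
    """Minimal achievable bottleneck when layers [layer:] remain to be placed
    and `server` is the next free server."""
    n_layers, n_servers = len(layer_flops), len(server_speed)
    if server == n_servers - 1:          # the last server takes all remaining layers
        return stage_time(layer, n_layers - 1, server)
    best = float("inf")
    # Try every cut point: this stage gets layers [layer, cut]; later servers
    # each still need at least one layer, which bounds the cut range.
    for cut in range(layer, n_layers - (n_servers - server - 1)):
        bottleneck = max(stage_time(layer, cut, server),
                         best_bottleneck(cut + 1, server + 1))
        best = min(best, bottleneck)
    return best

if __name__ == "__main__":
    print(f"estimated pipeline bottleneck: {best_bottleneck(0, 0):.3f}")
```

A full system would additionally search the hybrid (data/model) configuration within each stage and use profiled, rather than assumed, compute and communication costs.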
