Journal: Parallel Computing

A computational-graph partitioning method for training memory-constrained DNNs



Abstract

Many state-of-the-art Deep Neural Networks (DNNs) have substantial memory requirements, and limited device memory becomes a bottleneck when training such models. We propose ParDNN, an automatic, generic, and non-intrusive partitioning strategy for DNNs represented as computational graphs. ParDNN decides a placement of the DNN's underlying computational-graph operations across multiple devices so that the devices' memory constraints are met and the training time is minimized. ParDNN is completely independent of the deep learning aspects of a DNN: it requires no modification at either the model level or the systems-level implementation of its operation kernels. ParDNN partitions DNNs having billions of parameters and hundreds of thousands of operations in seconds to a few minutes. Our experiments with TensorFlow on 16 GPUs demonstrate efficient training of 5 very large models while achieving superlinear scaling for both the batch size and training throughput. ParDNN either outperforms or qualitatively improves upon the related work.
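To make the placement problem described above concrete, the sketch below is a minimal, hypothetical illustration (not the authors' algorithm): it greedily assigns the operations of a toy computational graph to devices so that no device's memory budget is exceeded. The op names, memory figures, and the greedy rule are assumptions for illustration only; ParDNN additionally minimizes training time, which this toy heuristic ignores.

```python
# Toy greedy op placement under per-device memory budgets.
# Illustrative only; not ParDNN's partitioning algorithm.
from collections import defaultdict

def place_ops(ops, device_memory):
    """ops: list of (op_name, memory_bytes); device_memory: {device: budget_bytes}."""
    used = defaultdict(int)   # memory already assigned to each device
    placement = {}            # op_name -> device
    for name, mem in sorted(ops, key=lambda x: -x[1]):   # largest ops first
        # Pick the device with the most remaining memory that still fits this op.
        free, dev = max((device_memory[d] - used[d], d) for d in device_memory)
        if mem > free:
            raise MemoryError(f"op {name} ({mem} B) does not fit on any device")
        placement[name] = dev
        used[dev] += mem
    return placement

if __name__ == "__main__":
    # Hypothetical ops with their peak memory footprints in bytes.
    ops = [("conv1", 6_000), ("conv2", 5_000), ("fc1", 3_000), ("loss", 1_000)]
    devices = {"/GPU:0": 8_000, "/GPU:1": 8_000}
    print(place_ops(ops, devices))
```

In a TensorFlow program, a mapping like the one returned here could then be applied by constructing each operation inside the corresponding tf.device(...) context, which is one way a non-intrusive, graph-level placement can be realized without touching model code or kernel implementations.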


