IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing

Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks

Abstract

We analyze the asymptotic performance of the training process of deep neural networks (NN) on clusters in order to determine their scalability. For this purpose, i) we assume a data-parallel implementation of the training algorithm, which distributes the batches among the cluster nodes and replicates the model; ii) we leverage the roofline model to inspect the performance at the node level, taking into account the floating-point unit throughput and memory bandwidth; and iii) we consider distinct collective communication schemes that are optimal depending on the message size and the underlying network interconnection topology. We then apply the resulting performance model to analyze the scalability of several well-known deep convolutional neural networks as a function of the batch size, node floating-point throughput, node memory bandwidth, cluster dimension, and link bandwidth.
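To make the kind of analysis described in the abstract concrete, the following is a minimal Python sketch of such a performance model, assuming a data-parallel training step in which each node processes its share of the batch (compute time bounded by the roofline model) and gradients are then combined with a ring allreduce. The function names, per-sample cost parameters, and all numeric values are illustrative assumptions, not formulas or figures taken from the paper.

```python
# Sketch of a roofline-plus-communication scalability model for
# data-parallel DNN training. All parameter values are assumptions.

def node_time_roofline(flops_per_sample, bytes_per_sample, samples_per_node,
                       peak_flops, mem_bandwidth):
    """Per-batch compute time on one node under the roofline model:
    the node is limited either by floating-point throughput or by
    memory bandwidth, whichever yields the longer time."""
    compute_bound = samples_per_node * flops_per_sample / peak_flops
    memory_bound = samples_per_node * bytes_per_sample / mem_bandwidth
    return max(compute_bound, memory_bound)


def allreduce_time_ring(model_bytes, num_nodes, link_bandwidth, latency):
    """Gradient allreduce cost for a bandwidth-optimal ring algorithm
    (a common choice for large messages): 2*(p-1)/p of the model volume
    crosses each link, plus 2*(p-1) latency terms."""
    p = num_nodes
    if p == 1:
        return 0.0
    return 2 * (p - 1) * latency + 2 * (p - 1) / p * model_bytes / link_bandwidth


def parallel_efficiency(batch_size, num_nodes, flops_per_sample, bytes_per_sample,
                        model_bytes, peak_flops, mem_bandwidth,
                        link_bandwidth, latency):
    """Strong-scaling efficiency of one training step: time on 1 node
    divided by (p * time on p nodes), with the batch split evenly."""
    t1 = node_time_roofline(flops_per_sample, bytes_per_sample,
                            batch_size, peak_flops, mem_bandwidth)
    tp = node_time_roofline(flops_per_sample, bytes_per_sample,
                            batch_size / num_nodes, peak_flops, mem_bandwidth)
    tp += allreduce_time_ring(model_bytes, num_nodes, link_bandwidth, latency)
    return t1 / (num_nodes * tp)


if __name__ == "__main__":
    # Illustrative magnitudes only (roughly ResNet-50-like).
    for p in (1, 2, 4, 8, 16, 32, 64):
        eff = parallel_efficiency(
            batch_size=256, num_nodes=p,
            flops_per_sample=8e9,       # ~8 GFLOP per forward+backward pass
            bytes_per_sample=0.4e9,     # memory traffic per sample
            model_bytes=100e6,          # ~25M parameters in FP32
            peak_flops=10e12,           # 10 TFLOP/s per node
            mem_bandwidth=900e9,        # 900 GB/s node memory bandwidth
            link_bandwidth=12.5e9,      # 100 Gb/s network link
            latency=5e-6)               # 5 us per-message latency
        print(f"p={p:3d}  efficiency={eff:.2f}")
```

Running the sketch shows the qualitative behavior the paper studies: efficiency stays near 1 while the per-node compute time dominates, and drops once the fixed allreduce cost becomes comparable to the shrinking per-node batch time.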
