International Conference on Neural Information Processing

Asynchronous, Data-Parallel Deep Convolutional Neural Network Training with Linear Prediction Model for Parameter Transition



Abstract

Recent studies have revealed that convolutional neural networks requiring a very large number of sum-of-product operations but relatively few parameters tend to exhibit strong model performance. Asynchronous stochastic gradient descent enables large-scale distributed computation for training such networks. However, asynchrony introduces stale gradients, which are known to slow down training. In this work, we propose a method that predicts future parameters during training to mitigate the drawback of staleness. We show that the proposed method achieves parameter-prediction accuracy good enough to improve the speed of asynchronous training. Experimental results on ImageNet demonstrate that, with 256 GPUs used in parallel, the proposed asynchronous training method reduces the training time needed to reach a given model accuracy by a factor of 1.9 compared to a synchronous training method.
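The core idea of the abstract, predicting the parameter values a stale gradient will be applied to, can be illustrated with a minimal linear-extrapolation sketch. This is an assumption-laden illustration, not the paper's exact prediction model: the function `predict_future_params` and the choice of a one-step difference as the linear trend are hypothetical.

```python
import numpy as np

def predict_future_params(w_curr, w_prev, staleness):
    """Linearly extrapolate parameters `staleness` update steps ahead.

    Assumes (hypothetically) that the most recent per-step parameter
    change continues unchanged, so a worker can compute its gradient
    against an estimate of the parameters that will actually be
    current when its update arrives at the parameter server.
    """
    step = w_curr - w_prev          # most recent parameter transition
    return w_curr + staleness * step  # linear prediction of the future

# Toy usage: a worker whose update will arrive 3 steps late.
w_prev = np.array([1.0, 2.0])
w_curr = np.array([1.1, 1.9])
w_pred = predict_future_params(w_curr, w_prev, staleness=3)
```

In an actual asynchronous setup, the gradient would then be evaluated at `w_pred` rather than at `w_curr`, reducing the mismatch caused by staleness.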
