
ASYNCHRONOUS GRADIENT AVERAGING DISTRIBUTED STOCHASTIC GRADIENT DESCENT


Abstract

A system for distributed training of a machine learning model over a plurality of computing nodes, comprising a server connected to the computing nodes and configured to control training of the model over a plurality of training iterations. Each training iteration comprises: instructing each computing node to train a respective local copy of the machine learning model by locally computing a respective cumulative gradient, each cumulative gradient comprising one or more gradients; obtaining the cumulative gradients from the computing nodes; and creating an updated machine learning model by merging the machine learning model with an aggregated value of the cumulative gradients. During the obtaining and creating phases, one or more of the computing nodes compute a new respective cumulative gradient, which is merged with the machine learning model in a following training iteration.
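To make the flow concrete, below is a minimal, runnable sketch of the asynchronous gradient-averaging scheme the abstract describes, using Python threads to stand in for networked computing nodes and a toy quadratic loss. All names (`node`, `LOCAL_STEPS`, `reports`, etc.), the loss, and the update rules are illustrative assumptions, not identifiers or parameters from the patent.

```python
# Minimal sketch, assuming threads as "computing nodes" and a toy quadratic
# loss; an illustration of the scheme, not the patented implementation.
import queue
import threading

import numpy as np

DIM = 4           # model size (hypothetical)
LOCAL_STEPS = 8   # gradients accumulated per node before reporting (hypothetical)
NUM_NODES = 3
ITERATIONS = 20
LR = 0.05

model = np.zeros(DIM)          # the server's machine learning model
model_lock = threading.Lock()
reports = queue.Queue()        # cumulative gradients sent up to the server
stop = threading.Event()


def local_gradient(params, rng):
    """Stochastic gradient of the toy loss ||params - 1||^2 / 2."""
    return (params - np.ones(DIM)) + rng.normal(scale=0.1, size=DIM)


def node(node_id):
    rng = np.random.default_rng(node_id)
    while not stop.is_set():
        with model_lock:
            local_copy = model.copy()      # respective local copy of the model
        cumulative = np.zeros(DIM)
        for _ in range(LOCAL_STEPS):       # cumulative gradient of one or more gradients
            g = local_gradient(local_copy, rng)
            cumulative += g
            local_copy -= LR * g           # keep training locally in the meantime
        reports.put(cumulative / LOCAL_STEPS)


nodes = [threading.Thread(target=node, args=(i,), daemon=True)
         for i in range(NUM_NODES)]
for t in nodes:
    t.start()

for _ in range(ITERATIONS):
    # Obtain whatever cumulative gradients have arrived so far; nodes keep
    # computing during this phase, and late reports merge in a later iteration.
    batch = [reports.get()]                # wait for at least one report
    while True:
        try:
            batch.append(reports.get_nowait())
        except queue.Empty:
            break
    aggregated = np.mean(batch, axis=0)    # aggregated value of the cumulative gradients
    with model_lock:
        model -= LR * aggregated           # merge into the updated model

stop.set()
print("final model (should approach all-ones):", model)
```

The asynchrony claimed in the abstract corresponds here to the server draining only the reports that have already arrived: a node still mid-computation during the obtain-and-merge phase never blocks the update, and its cumulative gradient is simply merged in a following iteration.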
