
SYSTEM AND METHOD FOR DYNAMIC SCHEDULING OF DISTRIBUTED DEEP LEARNING TRAINING JOBS


Abstract

A scheduling algorithm for training deep neural network (DNN) weights on processing units (PUs) uses a doubling heuristic to identify the next job to which a PU is provisionally assigned. The doubling heuristic uses an estimate of the number of training sets needed to complete training of the weights for a given job, a training speed function that indicates how fast the weights are converging, or both. The algorithm addresses the problem of assigning PUs efficiently when multiple DNN weight data structures must be trained. In some embodiments, the weights are trained using a ring-based message passing architecture. In some embodiments, the scheduling is performed in a nested-loop fashion. In inner iterations of the nested loop, PUs are scheduled and jobs are launched or restarted. In outer iterations, jobs are stopped, parameters are updated, and the inner iteration is re-entered.
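The nested loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the abstract does not define the doubling heuristic precisely, so the sketch assumes that each candidate job is evaluated by the reduction in estimated finish time per extra PU if its allocation were doubled. The `Job` class, the `speed` function (a made-up diminishing-returns curve), the `remaining` work unit, and the tie-breaking rule are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    remaining: float  # estimated training work left (hypothetical unit)
    pus: int = 0      # processing units currently assigned


def speed(pus: int) -> float:
    """Hypothetical training-speed function with diminishing returns."""
    return 0.0 if pus == 0 else pus / (1.0 + 0.1 * (pus - 1))


def finish_time(job: Job, pus: int) -> float:
    """Estimated time to finish the job on the given number of PUs."""
    return math.inf if pus == 0 else job.remaining / speed(pus)


def schedule(jobs: list[Job], total_pus: int) -> dict[str, int]:
    """Inner loop: hand out PUs by repeatedly doubling one job's share."""
    for job in jobs:
        job.pus = 0
    free = total_pus
    while free > 0:
        best, best_gain, best_extra = None, 0.0, 0
        for job in jobs:
            target = 1 if job.pus == 0 else 2 * job.pus  # doubling step
            extra = target - job.pus
            if extra > free:
                continue
            # Assumed gain: finish-time reduction per additional PU.
            gain = (finish_time(job, job.pus) - finish_time(job, target)) / extra
            if gain > best_gain:  # ties go to the earlier job in the list
                best, best_gain, best_extra = job, gain, extra
        if best is None:
            break  # no job can use the PUs that are left
        best.pus += best_extra
        free -= best_extra
    return {job.name: job.pus for job in jobs}


def run(jobs: list[Job], total_pus: int, interval: float = 1.0,
        max_rounds: int = 100) -> list[dict[str, int]]:
    """Outer loop: stop jobs, update estimates, re-enter the inner loop."""
    history = []
    for _ in range(max_rounds):
        active = [j for j in jobs if j.remaining > 0]
        if not active:
            break
        history.append(schedule(active, total_pus))
        for job in active:
            # Simulate one interval of training, then release the PUs.
            job.remaining = max(0.0, job.remaining - speed(job.pus) * interval)
            job.pus = 0
    return history
```

Calling `run([Job("A", 10.0), Job("B", 2.0)], total_pus=4)` returns one allocation per outer iteration. In this toy model, a job that finishes frees its PUs, so the remaining job can double its share in the next round.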
