JOB SCHEDULING METHOD FOR DISTRIBUTED DEEP LEARNING OVER A SHARED GPU CLUSTER AND COMPUTER-READABLE RECORDING MEDIUM
Abstract
The job scheduling method according to the present invention is a scheduling method for a shared GPU cluster used to train deep learning models. It comprises: a determination step of determining a GPU quota for each of a plurality of jobs; an estimation step of estimating, for each job, the learning rate if one additional GPU were allocated to it; an extraction step of extracting the job with the largest increase in speedup based on the estimated learning rates; an allocation step of adding one GPU to the quota of the extracted job; and an iteration step of repeating the estimation, extraction, and allocation steps in sequence until every one of the plurality of jobs has been allocated at least one GPU, where the speedup is (learning rate when using one GPU) / (learning rate upon additional GPU assignment).

According to the distributed deep learning job scheduling method for a shared GPU cluster of the present invention, and the computer-readable recording medium on which it is recorded, the GPU cluster can be managed efficiently: GPUs are distributed so that the entire cluster is utilized as effectively as possible, based on the improvement in training speed of each deep learning model. In particular, when scheduling multiple deep learning training jobs on a GPU cluster simultaneously, the method has the technical effect of minimizing both the average job completion time and the overall completion time.
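The allocation loop described above (estimate each job's gain from one more GPU, give the GPU to the job that gains the most, repeat) can be sketched as follows. This is a minimal illustration, not the patented implementation: the learning-rate estimator, the job names, and the `greedy_gpu_schedule` function are hypothetical, and the conventional speedup (rate with n GPUs divided by rate with one GPU) is used, since the inverted ratio in the machine-translated abstract appears to be a translation artifact.

```python
import math

def greedy_gpu_schedule(jobs, total_gpus, est_rate):
    """Greedy GPU-quota assignment sketch.

    jobs       -- list of job identifiers
    total_gpus -- number of GPUs in the shared cluster
    est_rate   -- est_rate(job, n) -> estimated learning rate of `job`
                  when trained on n GPUs (hypothetical estimator)
    """
    if total_gpus < len(jobs):
        raise ValueError("need at least one GPU per job")

    # Determination step: every job starts with a quota of one GPU.
    quota = {job: 1 for job in jobs}
    remaining = total_gpus - len(jobs)

    while remaining > 0:
        # Estimation step: how much each job's speedup would grow if it
        # received one additional GPU.  Speedup here is the conventional
        # rate(n) / rate(1).
        def speedup_gain(job):
            base = est_rate(job, 1)
            return (est_rate(job, quota[job] + 1)
                    - est_rate(job, quota[job])) / base

        # Extraction step: the job with the largest speedup increase.
        best = max(jobs, key=speedup_gain)

        # Allocation step: add one GPU to that job's quota.
        quota[best] += 1
        remaining -= 1

    return quota

# Toy estimator: job "a" scales linearly, job "b" with diminishing returns.
def toy_rate(job, n):
    return float(n) if job == "a" else math.sqrt(n)

print(greedy_gpu_schedule(["a", "b"], 4, toy_rate))  # {'a': 3, 'b': 1}
```

Because job "a" keeps gaining a full unit of speedup per extra GPU while job "b" shows diminishing returns, the greedy loop gives both spare GPUs to "a" — illustrating how the method steers GPUs toward jobs whose training speed improves the most.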