IEEE International Conference on Parallel and Distributed Systems

Effectiveness of Moldable and Malleable Scheduling in Deep Learning Tasks



Abstract

Research and development of deep learning (DL) applications often involves exhaustive trial-and-error, which demands that shared computational resources, especially GPUs, be efficiently allocated. Most DL tasks are moldable or malleable (i.e., the number of allocated GPUs can be changed before or during execution). However, conventional batch schedulers do not take advantage of DL tasks' moldability/malleability, inhibiting speedup when some GPU resources are unallocated. Another opportunity for speedup is to run multiple tasks concurrently on one GPU, which may improve the overall throughput because a single task does not always fully utilize the GPU's computational resources. We propose designing a batch scheduling system that exploits these opportunities to accelerate DL tasks. As a first step, this study conducts an extensive case study to evaluate the speedup of DL tasks when a scheduler treats them as moldable or malleable. That is, the scheduler adjusts the number of GPUs to be (or already) allocated to a task in response to the fluctuating availability of GPUs. Simulations using our real workload trace show that if the scheduler can allocate 1-4 GPUs to a task or assign 1-4 tasks to a GPU, then the average flow time of moldable/malleable DL tasks is shortened by at least 15.1 %/42.5 %, respectively, compared to a Rigid FCFS schedule in which one GPU is allocated to each task.
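To make the flow-time comparison concrete, below is a minimal, self-contained sketch. It is not the authors' simulator and does not use their workload trace: the six-task workload, the 4-GPU cluster size, and the scaling model are all made-up assumptions. It replays the workload under strict FCFS twice, once rigidly (one GPU per task) and once moldably (a task may take up to four free GPUs at launch); malleable reallocation during execution and sharing one GPU among multiple tasks are not modeled here.

```python
import heapq

# Hypothetical workload: (arrival_time_h, work_on_one_gpu_h). Made up for illustration.
WORKLOAD = [(0, 8), (0, 4), (1, 6), (2, 2), (3, 10), (5, 3)]
NUM_GPUS = 4  # assumed cluster size


def runtime(work, gpus):
    """Assumed scaling model: parallel efficiency drops 5% per extra GPU."""
    efficiency = 1.0 - 0.05 * (gpus - 1)
    return work / (gpus * efficiency)


def simulate(max_gpus_per_task):
    """Strict FCFS without backfilling: each task starts as soon as at least
    one GPU is free and grabs up to `max_gpus_per_task` of the free GPUs.
    max_gpus_per_task=1 is the rigid baseline; 4 is the moldable variant."""
    free = NUM_GPUS
    running = []            # min-heap of (finish_time, gpus_held)
    now = 0.0               # start time of the most recently launched task
    flow_times = []
    for arrival, work in sorted(WORKLOAD):
        now = max(now, arrival)
        # Reclaim GPUs from finished tasks; if none are free, wait for the
        # earliest-finishing running task.
        while running and (running[0][0] <= now or free == 0):
            finish, held = heapq.heappop(running)
            now = max(now, finish)
            free += held
        gpus = min(max_gpus_per_task, free)
        free -= gpus
        finish = now + runtime(work, gpus)
        heapq.heappush(running, (finish, gpus))
        flow_times.append(finish - arrival)
    return sum(flow_times) / len(flow_times)


if __name__ == "__main__":
    rigid = simulate(max_gpus_per_task=1)
    moldable = simulate(max_gpus_per_task=4)
    print(f"rigid FCFS (1 GPU/task) : avg flow time {rigid:.2f} h")
    print(f"moldable FCFS (1-4 GPUs): avg flow time {moldable:.2f} h")
```

Even this toy policy shortens average flow time relative to the rigid baseline whenever GPUs would otherwise sit idle, which is the effect the paper quantifies on a real workload trace.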
