Conference: IEEE International Parallel and Distributed Processing Symposium

Are Static Schedules so Bad? A Case Study on Cholesky Factorization

Abstract

Our goal is to analyze and compare static and dynamic strategies for task graph scheduling on platforms consisting of heterogeneous and unrelated resources, such as GPUs and CPUs. Static scheduling strategies, which have been used for years, suffer from several weaknesses. First, it is well known that the underlying optimization problems are NP-complete, which limits the ability to find optimal solutions to small cases. Second, parallelism inside processing nodes makes it difficult to precisely predict the performance of both communications and computations, due to shared resources and co-scheduling effects. Recently, to cope with these limitations, many dynamic task-graph-based runtime schedulers (StarPU, StarSs, QUARK, PaRSEC) have been proposed. Dynamic schedulers base their allocation and scheduling decisions on the one hand on dynamic information, such as the set of available tasks, the location of data, and the state of the resources, and on the other hand on static information, such as task priorities computed from the whole task graph. Our analysis is deep, but we concentrate on a single kernel, namely Cholesky factorization of dense matrices on platforms consisting of GPUs and CPUs. This application encompasses many characteristics that are important in our context. Indeed, it involves 4 different kernels (POTRF, TRSM, SYRK and GEMM) whose acceleration ratios on GPUs differ strongly (from 2.3 for POTRF to 29 for GEMM), and it consists of a phase where the number of available tasks is large, where the careful use of resources is critical, and a phase with few tasks available, where the choice of the task to be executed is crucial. In this paper, we analyze the performance of static and dynamic strategies, and we propose a set of intermediate strategies by adding more static (resp. dynamic) features into dynamic (resp. static) strategies.
Our conclusions are somewhat unexpected, in the sense that we prove that static-based strategies are very efficient, even in a context where performance estimations are not very good.
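
To make the task graph mentioned in the abstract more concrete, the following Python sketch builds the dependency graph of a tiled Cholesky factorization from the four kernels named above and measures its critical path. It assumes the standard right-looking tiled formulation, which the abstract does not spell out; the function names `tiled_cholesky_dag` and `critical_path_length` are hypothetical and introduced only for illustration. It models no timings, acceleration ratios, or scheduling policy.

```python
from collections import Counter


def tiled_cholesky_dag(n):
    """Return {task: set of predecessor tasks} for an n x n tile matrix.

    Assumption: standard right-looking tiled Cholesky on the lower triangle.
    Tasks are tuples such as ("POTRF", k) or ("GEMM", i, j, k).
    """
    last_writer = {}  # tile (row, col) -> task that last wrote it
    deps = {}

    def add(task, reads, writes):
        # A task depends on the last writer of every tile it reads or updates.
        touched = reads + [writes]
        deps[task] = {last_writer[t] for t in touched if t in last_writer}
        last_writer[writes] = task

    for k in range(n):
        add(("POTRF", k), [], (k, k))                    # factor diagonal tile
        for i in range(k + 1, n):
            add(("TRSM", i, k), [(k, k)], (i, k))        # solve panel tile
        for i in range(k + 1, n):
            add(("SYRK", i, k), [(i, k)], (i, i))        # update diagonal tile
            for j in range(k + 1, i):
                add(("GEMM", i, j, k), [(i, k), (j, k)], (i, j))  # update off-diagonal tile
    return deps


def critical_path_length(deps):
    """Length, in tasks, of the longest dependency chain."""
    depth = {}
    for task, preds in deps.items():  # insertion order is a topological order
        depth[task] = 1 + max((depth[p] for p in preds), default=0)
    return max(depth.values())


if __name__ == "__main__":
    dag = tiled_cholesky_dag(4)
    print(Counter(task[0] for task in dag))
    print(critical_path_length(dag))
```

In this formulation the number of GEMM tasks grows cubically with the number of tile rows while the critical path grows only linearly, which is consistent with the abstract's description of a phase with many available tasks and a phase with few.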
