
Autotuning of a Cut-Off for Task Parallel Programs



Abstract

The task parallel programming model is regarded as one of the promising parallel programming models with dynamic load balancing. Because this model supports hierarchical parallelism, it is well suited to parallel divide-and-conquer algorithms. Most naive divide-and-conquer task parallel programs, however, suffer from high tasking overhead because they tend to create tasks that are too fine-grained. There are two key ideas for improving the performance of such a program: serializing a task under a cut-off condition, which trades reduced concurrency against parallelization overhead, and applying effective transformations to the task under that condition. Both are sensitive to the features of the algorithm, so optimization by a compiler alone is ineffective in some cases. To address this problem, we propose an autotuning framework for divide-and-conquer task parallel programs. It automatically searches for the optimal combination of three basic transformation methods and switching conditions with little programmer effort. We implemented it as an optimization pass in LLVM. The evaluation shows a significant performance improvement (from 1.5x to 228x) over the original naive task parallel programs. Moreover, it demonstrates that the absolute performance obtained by our autotuning framework is comparable to that of loop parallel programs.
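
To make the cut-off idea concrete, the following is a minimal Python sketch and not the paper's LLVM pass. It shows a divide-and-conquer sum that stops creating tasks once a subproblem is at or below a cut-off size, plus a naive search that times a few candidate cut-offs and keeps the fastest. The names `dc_sum`, `autotune_cutoff`, and `_time_once` are hypothetical, a thread stands in for a task, and the exhaustive timing loop is an assumed stand-in for the framework's search over transformations and switching conditions. Because of CPython's global interpreter lock, the sketch illustrates the structure only and is not expected to show a parallel speedup.

```python
import threading
import time


def dc_sum(data, lo, hi, cutoff):
    """Divide-and-conquer sum of data[lo:hi]; serial at or below the cut-off."""
    if hi - lo <= cutoff:
        # Cut-off condition holds: serialize, so no further tasks are created.
        return sum(data[lo:hi])
    mid = (lo + hi) // 2
    result = {}

    def left():
        # Child task: handles the left half of the range.
        result["left"] = dc_sum(data, lo, mid, cutoff)

    child = threading.Thread(target=left)  # a thread stands in for a task
    child.start()
    right = dc_sum(data, mid, hi, cutoff)  # current task takes the right half
    child.join()
    return result["left"] + right


def _time_once(data, cutoff):
    """Wall-clock time of one dc_sum run with the given cut-off."""
    start = time.perf_counter()
    dc_sum(data, 0, len(data), cutoff)
    return time.perf_counter() - start


def autotune_cutoff(data, candidates, repeats=3):
    """Return the candidate cut-off with the lowest measured run time.

    Assumption: a plain exhaustive search over a hand-picked candidate list.
    """
    best_cutoff, best_time = None, float("inf")
    for cutoff in candidates:
        elapsed = min(_time_once(data, cutoff) for _ in range(repeats))
        if elapsed < best_time:
            best_cutoff, best_time = cutoff, elapsed
    return best_cutoff


if __name__ == "__main__":
    values = list(range(1 << 12))
    chosen = autotune_cutoff(values, [64, 256, 1024, 4096])
    print("chosen cut-off:", chosen)
    print("sum:", dc_sum(values, 0, len(values), chosen))
```

A smaller cut-off creates more tasks and exposes more concurrency, while a larger one reduces tasking overhead. The best value depends on the algorithm and the machine, which is why the sketch measures it instead of fixing it in advance.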
