The 39th International Conference on Parallel Processing

Efficient Work Stealing for Fine Grained Parallelism

Abstract

This paper deals with improving the performance of fine-grained task parallelism. It is often either cumbersome or impossible to increase the grain size of such programs. Increasing core counts exacerbates the problem: a program that appears coarse-grained on eight cores may well look far more fine-grained on sixty-four. In this paper we present the direct task stack, a novel work-stealing algorithm with unusually low overheads, both for creating tasks and for stealing. We compare the performance of our scheduler to Cilk++, the icc implementation of OpenMP 3.0, and the Intel TBB library on an eight-core, dual-socket Opteron machine. We also analyze why our techniques achieve consistent speedups over the other systems, ranging from 2-3x on many fine-grained workloads to over 50x in extreme cases, and show quantitatively how each of the techniques we use contributes to the improved performance.
