Journal: Parallel and Distributed Systems, IEEE Transactions on

Unifying Fixed Code Mapping, Communication, Synchronization and Scheduling Algorithms for Efficient and Scalable Loop Pipelining



Abstract

Pipelining allows loop iterations with cross-iteration dependences to overlap in time, provided that the loop body is partitioned into stages in a way that does not violate the data dependences. The stages are then mapped onto threads, and communication and synchronization between stages are typically implemented with queues. Pipelining techniques that rely on static scheduling perform poorly on load-imbalanced loops. Moreover, previous approaches that achieve load balancing are restricted to work-stealing and incur high overhead for fine-grained loops. In this article, we present URTS, a unified runtime system with compiler support that provides a lightweight dynamic scheduler by combining mapping, communication and synchronization algorithms with a suitable data structure and an efficient ticket mechanism. In particular, URTS shows that the efficiency of static scheduling can be combined with the load-imbalance tolerance of work-stealing through a unified design that exploits the properties of a novel data structure. An evaluation on 8- and 32-core machines shows that URTS incurs low overhead, of the same order as a static scheduler, for a set of benchmarks chosen from widely used collections. URTS is a scalable solution that performs efficient dynamic scheduling for fine-grained loops, a class of loops that state-of-the-art approaches handle poorly because of high overhead.
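
To make the baseline concrete, the following is a minimal sketch of the conventional scheme the abstract describes: a loop body split into stages, each stage fixed to one thread, with queues carrying values between stages. It is not URTS and does not reproduce its data structure or ticket mechanism, which the abstract does not specify. The names `run_pipeline`, `make_running_sum`, and `SENTINEL` are hypothetical and chosen only for illustration.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not part of URTS


def run_pipeline(items, stages):
    """Run each stage in its own thread; adjacent stages are linked by FIFO queues."""
    # One queue in front of every stage, plus one for the final results.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def worker(stage, inbox, outbox):
        # A stage handles iterations in arrival order, so any state it keeps
        # across iterations (a cross-iteration dependence) stays consistent.
        while True:
            item = inbox.get()
            if item is SENTINEL:
                outbox.put(SENTINEL)  # forward shutdown to the next stage
                return
            outbox.put(stage(item))

    threads = [
        threading.Thread(target=worker, args=(stage, queues[i], queues[i + 1]))
        for i, stage in enumerate(stages)
    ]
    for t in threads:
        t.start()

    # Feed the loop iterations into the first stage, then signal the end.
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    # Drain the last queue until the shutdown marker arrives.
    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)

    for t in threads:
        t.join()
    return results


def make_running_sum():
    """Build a stage with a cross-iteration dependence: each output uses the previous total."""
    total = 0

    def stage(x):
        nonlocal total
        total += x
        return total

    return stage


if __name__ == "__main__":
    # Stage 1 carries state across iterations; stage 2 is independent per iteration.
    print(run_pipeline(range(5), [make_running_sum(), lambda v: v * v]))
```

Because every stage is pinned to a single thread, a slow stage stalls the whole pipeline, which is the load-imbalance weakness of static scheduling that the abstract points to. In CPython the global interpreter lock also limits real parallelism for CPU-bound stages, so the sketch shows the structure of the scheme rather than its performance.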
