Journal: Parallel Computing

Scheduling directives: Accelerating shared-memory many-core processor execution


Abstract

We consider many-core processors with a task-graph-oriented programming model, in which scheduling constraints among tasks are decided offline and then enforced by the runtime system using dedicated hardware. In this setting, exposing and exploiting fine-grain data and control parallelism is increasingly important, so high expressive power for stating such constraints (directives), together with the ability to implement them in fast, simple hardware, is critical for success. In this paper, we focus on the relationships among different duplicable (multi-instance) tasks, which are used to express and exploit data parallelism. We extend the conventional Start-After-Complete (precedence) constraint so that it can also be used between replicas of different such tasks rather than only between entire tasks, thereby increasing the exposable parallelism. Additionally, we propose the parameterized Start-After-Start constraint, which can be used to control the degree of "lockstep" among multiple such tasks, e.g., to improve cache performance when the tasks work on the same data. We also briefly describe several additional directives. Finally, we show that the directives can be supported efficiently in hardware. The discussion uses Hypercore, a very efficient CREW PRAM-like shared-cache architecture; it is a challenging target because its dispatching for basic constraints is already extremely fast. However, the new directives have broader applicability. The demonstrated possibility of a simple implementation, together with indications of benefit, motivates further exploration of these directives, their implementation in hardware, and their support by programming tools.
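The abstract does not give formal semantics for the directives, so the following Python sketch only illustrates one plausible reading. Every name in it (`TaskState`, `sac_task_level`, `sac_replica_level`, `start_after_start`, `lag`) is hypothetical, and so is the simplification that replicas start and complete in index order. It contrasts the conventional task-level Start-After-Complete check with an assumed replica-level variant and an assumed parameterized Start-After-Start check.

```python
from dataclasses import dataclass


@dataclass
class TaskState:
    """Progress of one duplicable task (hypothetical model, not from the paper).

    Assumes replicas are started and completed in index order, so two
    counters are enough to describe the task's progress.
    """
    replicas: int       # total number of instances of this task
    started: int = 0    # replicas 0..started-1 have been dispatched
    completed: int = 0  # replicas 0..completed-1 have finished


def sac_task_level(pred: TaskState, succ_index: int) -> bool:
    """Conventional Start-After-Complete: every successor replica
    waits until the whole predecessor task has completed."""
    return pred.completed == pred.replicas


def sac_replica_level(pred: TaskState, succ_index: int) -> bool:
    """Assumed replica-level variant: successor replica i waits only
    for predecessor replica i (clamped to the last replica)."""
    needed = min(succ_index, pred.replicas - 1)
    return pred.completed > needed


def start_after_start(pred: TaskState, succ_index: int, lag: int) -> bool:
    """Assumed parameterized Start-After-Start: successor replica i may
    start once predecessor replica i + lag has started (clamped).
    A small lag keeps the two tasks close to lockstep."""
    needed = min(succ_index + lag, pred.replicas - 1)
    return pred.started > needed


if __name__ == "__main__":
    a = TaskState(replicas=8, started=4, completed=3)
    print(sac_task_level(a, 0))        # whole task A not yet complete
    print(sac_replica_level(a, 2))     # replica 2 of A has completed
    print(start_after_start(a, 1, 2))  # replica 3 of A has started
```

Under this reading, the replica-level check lets successor replicas begin while the predecessor is still running, which is the extra parallelism the abstract refers to. The `lag` parameter bounds the successor's progress in one direction only; bounding both directions would need a second constraint.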
