Published in: International Workshop on OpenMP

Description, Implementation and Evaluation of an Affinity Clause for Task Directives


Abstract

OpenMP 4.0 introduced dependent tasks, which give the programmer a way to express fine-grained parallelism. Using appropriate OS support (such as NUMA libraries), the runtime can rely on the information in the depend clause to dynamically map the tasks to the architecture topology. Controlling data locality is one of the key factors in reaching a high level of performance on NUMA architectures. On this topic, OpenMP does not yet give the programmer much flexibility, leaving the runtime to decide where a task should be executed. In this paper, we present a class of applications that would benefit from such control and flexibility over task and data placement. We also propose our own interpretation of the new affinity clause for the task directive, which is being discussed by the OpenMP Architecture Review Board. This clause enables the programmer to give hints to the runtime about task placement during program execution, which can be used to control the data mapping on the architecture. In our proposal, the programmer can express affinity between a task and the following resources: a thread, a NUMA node, and a piece of data. We then present an implementation of this proposal in the Clang-3.8 compiler, and an implementation of the corresponding extensions in our OpenMP runtime libKOMP. Finally, we present a preliminary evaluation of this work, running two task-based OpenMP kernels on a 192-core NUMA architecture, which shows noticeable improvements in both performance and scalability.
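
To make the idea concrete, here is a minimal sketch of dependent tasks that carry a data-affinity hint. It is not the paper's implementation: the abstract does not give the proposed syntax, so the sketch assumes the data-only `affinity(locator)` form later standardized in OpenMP 5.0, and it falls back to a plain dependent task when the compiler does not report OpenMP 5.0 support. The thread and NUMA-node affinity kinds described in the abstract are not shown. The names `scale_block` and `scale_twice` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scale one contiguous block [lo, hi) of the array in place. */
static void scale_block(double *a, size_t lo, size_t hi, double factor)
{
    for (size_t i = lo; i < hi; ++i)
        a[i] *= factor;
}

/* Hypothetical helper: two passes over `a`, one task per block per pass.
 * The depend clause orders the two tasks that touch the same block.
 * The affinity clause (assumed OpenMP 5.0 form) is only a hint that the
 * task should run close to where a[lo] is stored. */
void scale_twice(double *a, size_t n, size_t block, double factor)
{
    assert(block > 0); /* a zero block size would never advance the loop */

    #pragma omp parallel
    #pragma omp single
    for (int pass = 0; pass < 2; ++pass) {
        for (size_t lo = 0; lo < n; lo += block) {
            size_t hi = (lo + block < n) ? lo + block : n;
#if defined(_OPENMP) && _OPENMP >= 201811
            #pragma omp task depend(inout: a[lo]) affinity(a[lo]) firstprivate(lo, hi)
#else
            #pragma omp task depend(inout: a[lo]) firstprivate(lo, hi)
#endif
            scale_block(a, lo, hi, factor);
        }
    }
}
```

Built with an OpenMP flag such as `-fopenmp`, the tasks of the second pass wait for the first-pass task on the same block. Without the flag the pragmas are ignored and the code runs serially with the same result. Because the affinity clause is a hint, a runtime is free to ignore it, so any locality benefit depends on the runtime and on how the data was first placed in memory.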
