Automatic SMT threading for OpenMP applications on the Intel Xeon Phi co-processor

Abstract

Simultaneous multithreading (SMT) can improve performance when running parallel applications on the Intel Xeon Phi co-processor. Selecting the most efficient thread count is, however, non-trivial: the potential gain in efficiency has to be balanced against potentially negative factors such as inter-thread competition for cache capacity and increased synchronization overhead. In this paper, we extend CRUST (ClusteR-aware Under-subscribed Scheduling of Threads), a technique for finding the optimum thread count of OpenMP applications running on clustered cache architectures, to take the behavior of simultaneous multithreading on the Xeon Phi into account. CRUST automatically finds the optimum thread count at sub-application granularity by exploiting application phase behavior at OpenMP parallel section boundaries, and it uses hardware performance counter information to gain insight into the application's behavior. We implement a CRUST prototype inside the Intel OpenMP runtime library and show its efficiency running on real Xeon Phi hardware.
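The general idea of choosing a thread count separately for each parallel section can be illustrated with a small sketch. This is not the CRUST algorithm, whose details the abstract does not give. It is a simplified stand-in that tries every candidate thread count once per region and then keeps the cheapest one. The class name `RegionTuner`, the region names, the candidate counts (1 to 4 threads per core on an assumed 60-core device), and the cost numbers are all hypothetical. The single `cost` value stands in for whatever metric a real runtime would derive from hardware performance counters.

```python
from typing import Dict, List


class RegionTuner:
    """Illustrative per-region thread-count search (not the CRUST algorithm)."""

    def __init__(self, candidates: List[int]) -> None:
        if not candidates:
            raise ValueError("need at least one candidate thread count")
        # Drop duplicates while keeping the caller's order.
        self.candidates = list(dict.fromkeys(candidates))
        self.samples: Dict[str, Dict[int, float]] = {}
        self.best: Dict[str, int] = {}

    def next_thread_count(self, region: str) -> int:
        """Return the thread count to use for the next run of `region`."""
        if region in self.best:
            return self.best[region]
        tried = self.samples.setdefault(region, {})
        for n in self.candidates:
            if n not in tried:
                return n  # still exploring: try an unmeasured candidate
        return min(self.candidates, key=lambda n: tried[n])

    def record(self, region: str, threads: int, cost: float) -> None:
        """Store the measured cost (lower is better) of one region run."""
        tried = self.samples.setdefault(region, {})
        tried[threads] = cost
        if all(n in tried for n in self.candidates):
            self.best[region] = min(self.candidates, key=lambda n: tried[n])


def run_demo() -> Dict[str, int]:
    # Hypothetical costs: one region benefits from SMT, the other is hurt by
    # the extra synchronization that more threads bring.
    costs = {
        "compute": {60: 4.0, 120: 2.5, 180: 2.0, 240: 2.2},
        "sync_heavy": {60: 1.0, 120: 1.4, 180: 1.9, 240: 2.6},
    }
    tuner = RegionTuner([60, 120, 180, 240])
    for _ in range(6):  # six passes over the parallel regions
        for region, table in costs.items():
            n = tuner.next_thread_count(region)
            tuner.record(region, n, table[n])
    return tuner.best


if __name__ == "__main__":
    print(run_demo())
```

In this sketch each region settles on its own thread count after four measured runs, which mirrors the sub-application granularity described in the abstract. An exhaustive search like this one is only practical when the candidate list is short.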