Source: Proceedings of the 7th Workshop on Languages, Compilers, and Run-time Support for Scalable Systems

Runtime support for integrating precomputation and thread-level parallelism on simultaneous multithreaded processors



Abstract

This paper presents runtime mechanisms that enable flexible use of speculative precomputation in conjunction with thread-level parallelism (TLP) on simultaneous multithreaded (SMT) processors. The mechanisms were implemented and evaluated on a real multi-SMT system. So far, speculative precomputation and thread-level parallelism have been used disjunctively on SMT processors, and no attempts have been made to compare and possibly combine these techniques for further optimization. We present runtime support mechanisms for coordinating precomputation with its sibling computation, so that precomputation is regulated to avoid cache pollution and is allowed a sufficient runahead distance from the targeted computation. We also present a task queue mechanism to orchestrate precomputation and thread-level parallelism, so that they can be used conjunctively in the same program. The mechanisms are motivated by the observation that different parts of a program may benefit from different modes of multithreaded execution. Furthermore, idle periods during TLP execution or sequential sections can be used for precomputation, and vice versa. We apply the mechanisms to loop-structured scientific codes. We present experimental results that verify that no single technique (precomputation or TLP) in isolation achieves the best performance in all cases. An efficient combination of precomputation and TLP is most often the best solution.
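The coordination idea in the abstract, keeping a precomputation thread ahead of its sibling computation but not so far ahead that it pollutes the cache, can be illustrated with a small sketch. This is not the paper's implementation: the class `RunaheadThrottle`, the function `run`, and the parameter `max_distance` are hypothetical names chosen for illustration, and Python threads only model the synchronization logic, not real cache prefetching on SMT hardware.

```python
import threading


class RunaheadThrottle:
    """Hypothetical helper: bounds how far a precomputation thread may run ahead."""

    def __init__(self, max_distance):
        self.max_distance = max_distance  # assumed tuning knob, not from the paper
        self.completed = 0                # iterations finished by the main thread
        self.cond = threading.Condition()

    def advance(self):
        """Main thread: report one finished iteration and wake the helper."""
        with self.cond:
            self.completed += 1
            self.cond.notify_all()

    def next_target(self, i):
        """Helper thread: block until iteration i lies inside the runahead window.

        Returns (target, lead). If the helper has fallen behind the main
        thread, target skips forward to the main thread's current iteration,
        since precomputing already-executed iterations is useless.
        """
        with self.cond:
            while i - self.completed >= self.max_distance:
                self.cond.wait()
            target = max(i, self.completed)
            return target, target - self.completed


def run(n_iters=200, max_distance=8):
    data = list(range(n_iters))
    throttle = RunaheadThrottle(max_distance)
    leads = []  # observed runahead distances, recorded for inspection

    def helper():
        i = 0
        while i < n_iters:
            i, lead = throttle.next_target(i)
            if i >= n_iters:
                break
            leads.append(lead)
            _ = data[i]  # stand-in for touching the data the main loop will need
            i += 1

    t = threading.Thread(target=helper)
    t.start()
    total = 0
    for i in range(n_iters):
        total += data[i]  # the "real" computation
        throttle.advance()
    t.join()
    return total, leads
```

Calling `run()` returns the loop result together with the runahead distances the helper observed; every recorded distance stays below `max_distance`. The main thread never waits on the helper, so a slow helper cannot delay the targeted computation, which is one plausible design choice under these assumptions.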
