Runtime support for integrating precomputation and thread-level parallelism on simultaneous multithreaded processors


Abstract

This paper presents runtime mechanisms that enable flexible use of speculative precomputation in conjunction with thread-level parallelism (TLP) on SMT processors. The mechanisms were implemented and evaluated on a real multi-SMT system. So far, speculative precomputation and thread-level parallelism have been used disjunctively on SMT processors, and no attempts have been made to compare and possibly combine these techniques for further optimization. We present runtime support mechanisms for coordinating precomputation with its sibling computation, so that precomputation is regulated to avoid cache pollution and sufficient runahead distance is allowed from the targeted computation. We also present a task queue mechanism to orchestrate precomputation and thread-level parallelism, so that they can be used conjunctively in the same program. The mechanisms are motivated by the observation that different parts of a program may benefit from different modes of multithreaded execution. Furthermore, idle periods during TLP execution or sequential sections can be used for precomputation and vice versa. We apply the mechanisms in loop-structured scientific codes. We present experimental results that verify that no single technique (precomputation or TLP) in isolation achieves the best performance in all cases. Efficient combination of precomputation and TLP is most often the best solution.
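The idea of regulating precomputation, so that it stays far enough ahead to be useful but not so far that it pollutes the cache, can be illustrated with a toy model. The sketch below is not the paper's runtime: it is a single-threaded, lockstep simulation, and the function name `simulate_runahead` and the parameters `min_ahead`, `max_ahead`, and `helper_speed` are all hypothetical names introduced here for illustration.

```python
def simulate_runahead(n_iters, min_ahead, max_ahead, helper_speed=2):
    """Toy lockstep model of a helper that precomputes for a main loop.

    All names and parameters are hypothetical; this is not the paper's runtime.
    Returns the helper's lead (in iterations) observed each time the main
    computation executes one iteration.
    """
    if not (1 <= min_ahead <= max_ahead) or helper_speed < 1:
        raise ValueError("need 1 <= min_ahead <= max_ahead and helper_speed >= 1")

    main_pos = 0    # next iteration the main computation will execute
    helper_pos = 0  # next iteration the helper will precompute for
    leads = []

    while main_pos < n_iters:
        # Helper: advance, but stall once it is max_ahead iterations in front,
        # so precomputed data is not evicted before the main computation uses it.
        for _ in range(helper_speed):
            if helper_pos < n_iters and helper_pos - main_pos < max_ahead:
                helper_pos += 1

        # Main: wait until the helper has a sufficient lead (or has finished).
        if helper_pos - main_pos >= min_ahead or helper_pos == n_iters:
            leads.append(helper_pos - main_pos)
            main_pos += 1

    return leads


if __name__ == "__main__":
    print(simulate_runahead(10, min_ahead=2, max_ahead=4))
```

In this model the helper's lead never exceeds `max_ahead`, and the main computation only starts an iteration with a lead below `min_ahead` once the helper has run out of work. A real implementation on SMT hardware would need actual threads and synchronization, which this sketch deliberately omits.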


Bibliographic details

Source: Proceedings of the 7th Workshop on Languages, Compilers, and Run-time Support for Scalable Systems, 2004, pp. 1-12 (12 pages)
Conference location: Houston, TX (US)
Authors: Tanping Wang; Filip Blagojevic; Dimitrios S. Nikolopoulos
Affiliation: The College of William and Mary, McGlothlin-Street Hall, Williamsburg, VA
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology

Similar literature

1. Dual-thread Speculation: A Simple Approach to Uncover Thread-level Parallelism on a Simultaneous Multithreaded Processor [J]. Fredrik Warg, Per Stenstrom. International Journal of Parallel Programming, 2008, No. 2.
2. Memory-Level Parallelism Aware Fetch Policies for Simultaneous Multithreading Processors [J]. Stijn Eyerman, Lieven Eeckhout. ACM Transactions on Architecture and Code Optimization, 2009, No. 1.
3. Improving Server Software Support for Simultaneous Multithreaded Processors [J]. Luke K. McDowell, Susan J. Eggers, Steven D. Gribble. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages, 2003, No. 10.
4. Supporting Speculative Multithreading on Simultaneous Multithreaded Processors [C]. Venkatesan Packirisamy, Shengyue Wang, Antonia Zhai. High Performance Computing - HiPC 2006, Lecture Notes in Computer Science, vol. 4297, 2006.
5. Efficient Runtime Support for Reliable and Scalable Parallelism [D]. Minjia Zhang. 2016.
6. Exploiting Thread-Level and Instruction-Level Parallelism to Cluster Mass Spectrometry Data using Multicore Architectures [O]. Fahad Saeed, Jason D. Hoffert, Trairak Pisitkun.
7. Runtime Support for Integrating Precomputation and Thread-Level Parallelism on Simultaneous Multithreaded Processors [O]. T. Wang, F. Blagojevic, Dimitrios Nikolopoulos. 2004.
