This dissertation introduces pre-execution, a novel technique for accelerating sequential programs. Pre-execution directly attacks the instructions that cause performance problems: mispredicted branches and cache-missing loads. In pre-execution, future branch outcomes and load addresses are computed on the side and the results are fed to the main program. In doing so, the main program is spared the full computation latencies of these instructions. Pre-execution exploits two mechanisms: out-of-order fetch and decoupling. Fetching and executing only the critical load and branch computations while skipping over all unrelated instructions allows pre-execution to compute values faster than the main program. Decoupling, doing so in a separate thread, isolates stalls that occur in these computations so that they do not directly impact the main program thread.

This dissertation describes speculative data-driven multithreading (DDMT), an implementation of pre-execution. DDMT implements the runtime component of pre-execution, which is responsible for pre-executing computations and communicating the results to the main program, as an extension to a superscalar processor. In addition to using the single cache hierarchy to allow pre-executing computations to prefetch for the main program, DDMT stores individual pre-executed instruction results in the shared physical register file and then passes them one by one to the main program via a novel modification to register renaming called register integration.

For DDMT's setup component, which is responsible for finding load and branch computations and conveying them to the runtime component, this dissertation introduces an algorithm for automatically extracting performance-enhancing computations from program traces. The algorithm evaluates a benefit-cost function over all candidate computations in a trace and chooses those that maximize benefit (latency tolerance) while minimizing cost (execution overhead).
The algorithm is formulated to permit software, hardware, and hybrid implementations.

The dissertation includes a simulation-driven performance evaluation of DDMT. Our results show that DDMT achieves 10% to 15% performance improvements for general-purpose integer programs running on an aggressive baseline processor with large caches, with the potential for greater improvements on likely future processor designs. We conclude that pre-execution and DDMT are promising technologies that merit consideration for inclusion in future machines.
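The benefit-cost selection idea above can be illustrated with a minimal sketch. This is a hypothetical rendering, not the dissertation's exact formulation: the `Candidate` fields and the simple net-benefit heuristic (keep a candidate computation when its latency-tolerance benefit exceeds its execution-overhead cost) are illustrative assumptions.

```python
# Hypothetical sketch of benefit-cost selection over candidate computations
# extracted from a program trace. Names and the scoring rule are assumptions
# made for illustration only.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str        # the problem load or branch this computation targets
    benefit: float   # estimated cycles of latency tolerated if pre-executed
    cost: float      # estimated overhead cycles to fetch/execute the slice

def select_computations(candidates, min_net=0.0):
    """Keep candidates whose benefit exceeds cost, highest net benefit first."""
    scored = [(c.benefit - c.cost, c) for c in candidates]
    scored.sort(key=lambda t: -t[0])
    return [c for net, c in scored if net > min_net]

# Example: two cache-missing loads and one mispredicted branch.
cands = [
    Candidate("load A", benefit=120.0, cost=30.0),
    Candidate("branch B", benefit=15.0, cost=40.0),  # overhead outweighs benefit
    Candidate("load C", benefit=80.0, cost=20.0),
]
print([c.name for c in select_computations(cands)])  # → ['load A', 'load C']
```

The same scoring loop could run offline in software over a recorded trace, in hardware over a retirement-stream buffer, or in a hybrid of the two, which is the flexibility the formulation is meant to permit.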