Computer Architecture Letters

Revisiting Using the Results of Pre-Executed Instructions in Runahead Processors

Abstract

Long-latency cache accesses cause significant performance-impacting delays for both in-order and out-of-order processor systems. To address these delays, runahead pre-execution has been shown to produce speedups by warming up cache structures during stalls caused by long-latency memory accesses. While improving cache-related performance, basic runahead approaches do not otherwise utilize the results of accurately pre-executed instructions during normal operation. This simple model of execution is potentially inefficient and performance-constraining. However, a previous study showed that exploiting the results of accurately pre-executed runahead instructions in out-of-order processors provides little performance improvement over simple re-execution. This work shows that, unlike for out-of-order runahead architectures, the performance improvement from runahead result use in an in-order pipeline is more significant on average, and in some situations it is dramatic. For a set of SPEC CPU2006 benchmarks that experience a performance improvement from basic runahead, adding result use to the pipeline provided an additional speedup of 1.14× (up to 1.48×) for an in-order processor model, compared to only 1.05× (up to 1.16×) for an out-of-order one. For benchmarks with poor data cache locality, the average speedup increased to 1.21× for in-order, compared to only 1.10× for out-of-order.
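To make the runahead-with-result-reuse idea concrete, below is a minimal sketch of the mechanism the abstract describes; it is an idealized toy model written for illustration, not the paper's simulator. The Instr type, the MISS_LATENCY constant, the one-cycle-per-instruction cost model, and the zero-cost commit of reused results are all assumptions. On a long-latency miss the core checkpoints state, pre-executes the following instructions with the miss's destination marked invalid, and, when result reuse is enabled, commits the values of accurately pre-executed instructions instead of re-executing them once the miss returns.

```python
# Minimal, illustrative sketch of runahead execution with result reuse on an
# in-order core. This is a simplified model for exposition only; Instr,
# MISS_LATENCY, and the cycle accounting are assumptions, not the paper's setup.

from dataclasses import dataclass
from typing import Dict, Tuple

MISS_LATENCY = 200   # assumed stall, in cycles, for a long-latency memory access

@dataclass(frozen=True)
class Instr:
    dest: str                       # destination register
    srcs: Tuple[str, ...] = ()      # source registers
    misses: bool = False            # True => a load that misses the data cache

def alu(ins: Instr, regs: Dict[str, int]) -> int:
    # Placeholder "execute" step: sum of the available source operands.
    return sum(regs.get(s, 0) for s in ins.srcs)

def run(program, result_reuse: bool = True) -> int:
    regs: Dict[str, int] = {}
    reused: Dict[int, int] = {}     # index -> value pre-executed during runahead
    cycles = 0
    for i, ins in enumerate(program):
        if i in reused:
            # Result reuse: commit the pre-executed value without re-executing
            # (counted as free in this idealized model).
            regs[ins.dest] = reused[i]
            continue
        cycles += 1
        if ins.misses:
            # Long-latency miss: checkpoint, then pre-execute under the stall.
            ra_regs = dict(regs)
            valid = {ins.dest: False}        # the miss's value is unknown (INV)
            for j in range(i + 1, len(program)):
                ra = program[j]
                ok = all(valid.get(s, True) for s in ra.srcs) and not ra.misses
                valid[ra.dest] = ok
                if ok:
                    ra_regs[ra.dest] = alu(ra, ra_regs)
                    if result_reuse:
                        reused[j] = ra_regs[ra.dest]
                # Misses found here would be prefetched: the basic runahead win.
            cycles += MISS_LATENCY
            regs[ins.dest] = 0               # miss data returns (value illustrative)
        else:
            regs[ins.dest] = alu(ins, regs)
    return cycles

# Example: one miss followed by work independent of it; result reuse lets the
# in-order core skip re-executing that work after the miss returns.
prog = [Instr("r1", misses=True), Instr("r2", ("r3",)), Instr("r4", ("r2",)),
        Instr("r5", ("r1",))]
print(run(prog, result_reuse=False), run(prog, result_reuse=True))
```

In this toy model the benefit of result reuse is simply the skipped re-execution work; intuitively, that matters more on an in-order pipeline, which cannot overlap re-execution with other useful work the way an out-of-order core can, which is consistent with the speedups reported in the abstract.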
