Accurately approximating superscalar processor performance from traces

Abstract

Trace-driven simulation of superscalar processors is particularly complicated. The dynamic nature of superscalar processors combined with the static nature of traces can lead to large inaccuracies in the results, especially when traces contain only a subset of executed instructions for trace reduction. The main problem in the filtered trace simulation is that the trace does not contain enough information with which one can predict the actual penalty of a cache miss. In this paper, we discuss and evaluate three strategies to quantify the impact of a long latency memory access in a superscalar processor when traces have only L1 cache misses. The strategies are based on models about how a cache miss is treated with respect to other cache misses: (1) isolated cache miss model, (2) independent cache miss model, and (3) pairwise dependent cache miss model. Our experimental results demonstrate that the pairwise dependent cache miss model produces reasonably accurate results (4.8% RMS error) under perfect branch prediction. Our work forms a basis for fast, accurate, and configurable multicore processor simulation using a pre-determined processor core design.
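
The abstract names the three cache miss models but does not define them, so the following is only a toy sketch of how such models might differ when a trace holds nothing but L1 misses. Everything in it is an assumption made for illustration, not the authors' method: the `Miss` record, the `depends_on_prev` flag, the `estimate_cycles` function, the default `base_cpi`, `miss_latency`, and `rob_size` values, and the rule that misses falling within one reorder-buffer window share a single penalty.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Miss:
    """One L1 miss record in a filtered trace (hypothetical format)."""
    instr_index: int       # position of the missing access in the dynamic instruction stream
    depends_on_prev: bool  # assumed flag: this miss needs data returned by the previous miss


def estimate_cycles(misses: List[Miss], total_instrs: int, model: str,
                    base_cpi: float = 0.5, miss_latency: int = 200,
                    rob_size: int = 64) -> float:
    """Estimate cycles as ideal execution time plus exposed miss penalties.

    model is one of "isolated", "independent", "pairwise" (illustrative names).
    """
    if model not in ("isolated", "independent", "pairwise"):
        raise ValueError(f"unknown model: {model}")

    base = total_instrs * base_cpi

    # Isolated: every miss pays the full latency, with no overlap at all.
    if model == "isolated":
        return base + len(misses) * miss_latency

    exposed = 0          # number of penalties that are not hidden by overlap
    window_start = None  # instruction index of the miss that opened the current window
    for m in misses:
        # Assumption: a miss overlaps with the window leader if both fit in the ROB.
        overlaps = (window_start is not None
                    and m.instr_index - window_start < rob_size)
        # Pairwise dependent: a miss that depends on the previous miss cannot overlap it.
        if model == "pairwise" and m.depends_on_prev:
            overlaps = False
        if not overlaps:
            exposed += 1
            window_start = m.instr_index
    return base + exposed * miss_latency


trace = [Miss(10, False), Miss(20, False), Miss(30, True), Miss(500, False)]
for name in ("isolated", "independent", "pairwise"):
    print(name, estimate_cycles(trace, total_instrs=1000, model=name))
```

In this sketch the isolated variant charges every miss in full, the independent variant hides any miss that falls within the assumed window of an earlier one, and the pairwise variant re-exposes a hidden miss when it depends on its predecessor, so its estimate falls between the other two for this trace.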


Bibliographic details

Source
Performance Analysis of Systems and Software, 2009. ISPASS 2009 | 2009 | pp. 238-248 | 11 pages
Conference location: Boston, MA (US)
Authors
Kiyeon Lee; Shayne Evans; Sangyeun Cho
Author affiliation
Dept. of Comput. Sci., Univ. of Pittsburgh, Pittsburgh, PA

Original format: PDF
Keywords
cache storage; microprocessor chips; L1 cache misses; configurable multicore processor simulation; independent cache miss model; isolated cache miss model; pairwise dependent cache miss model; perfect branch prediction; predetermined processor core design; superscalar processor performance approximation; trace reduction; trace-driven simulation


