A mechanistic model of memory level parallelism fed with cache miss rates


Abstract

Non-blocking caches, which are common in modern out-of-order processors, can handle multiple outstanding memory requests simultaneously to reduce the penalty of long-latency cache misses. Memory level parallelism (MLP), the number of memory requests concurrently held by the Miss Status Handling Registers (MSHRs), is an indispensable factor in estimating cache performance. To obtain MLP efficiently, previous studies oversimplified the factors that must be considered when constructing analytical models, especially the influence of the cache miss rate. By quantifying these cache miss rate effects, this paper proposes a mechanistic model of memory level parallelism that is more accurate than existing work. Fifteen benchmarks, chosen from MobyBench 2.0, MiBench 1.0 and MediaBench II, are used to evaluate the accuracy of the model. Compared with gem5 cycle-accurate simulation results, the largest root mean square error is less than 11%, while the average is around 7%. Meanwhile, cache performance forecasting is about 38 times faster than gem5 cycle-accurate simulation.
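
To make the MLP definition in the abstract concrete, the following minimal sketch computes an average MLP from a trace of outstanding misses. It is an illustration only and is not the paper's mechanistic model: the function name `average_mlp`, the `(issue_cycle, complete_cycle)` trace format, and the choice to average only over cycles with at least one outstanding miss are all assumptions made for this example.

```python
def average_mlp(misses):
    """Average number of outstanding misses, taken over the cycles in
    which at least one miss is outstanding.

    misses: list of (issue_cycle, complete_cycle) pairs, treated as
    half-open intervals [issue_cycle, complete_cycle). This trace format
    is a hypothetical one chosen for illustration.
    """
    # Turn each miss into a +1 event at issue and a -1 event at completion.
    events = []
    for issue_cycle, complete_cycle in misses:
        events.append((issue_cycle, 1))
        events.append((complete_cycle, -1))
    events.sort()

    outstanding = 0      # misses currently in flight (MSHR occupancy)
    busy_cycles = 0      # cycles with at least one miss in flight
    weighted_cycles = 0  # sum over busy cycles of the occupancy
    previous_cycle = None

    for cycle, delta in events:
        if previous_cycle is not None and outstanding > 0:
            span = cycle - previous_cycle
            busy_cycles += span
            weighted_cycles += span * outstanding
        outstanding += delta
        previous_cycle = cycle

    return weighted_cycles / busy_cycles if busy_cycles else 0.0


if __name__ == "__main__":
    # Two 100-cycle misses that overlap for 50 cycles.
    print(average_mlp([(0, 100), (50, 150)]))
```

With fully serialized misses the function returns 1.0, and with two misses that overlap completely it returns 2.0, which is the sense in which a higher MLP hides more miss latency.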


Bibliographic record

Source
IEEE Pacific Rim Conference on Communications, Computers and Signal Processing | 2017 | 321p | 6 pages
Authors
Qin Wang; Kecheng Ji; Ming Ling; Longxing Shi
Original format
PDF
Chinese Library Classification
TN91-53
Keywords
Analytical models; Benchmark testing; Parallel processing; Out of order; Pipelines; Hardware

Similar literature

1. An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness [J]. Sunpyo Hong, Hyesoon Kim. Computer Architecture News. 2009, No. 3
2. Simulation and Modeling of a Five-Level (NPC) Inverter Fed by a Photovoltaic Generator and Integrated in a Hybrid Wind-PV Power System [J]. M. Rezki, I. Griche. Engineering Technology and Applied Science Research. 2017, No. 4
3. Modeling of heat generated on stress grading coatings of motors fed by multilevel drives [J]. Espino-Cortes F.P., Gomez P., Betanzos Ramirez J.D. IEEE Transactions on Dielectrics and Electrical Insulation. 2011, No. 4
4. A mechanistic model of memory level parallelism fed with cache miss rates [C]. Qin Wang, Kecheng Ji, Ming Ling. 2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing. 2017
5. Model-driven memory optimizations for high performance computing: From caches to I/O [D]. Frasca, Michael. 2012
6. Dietary fat increases high density lipoprotein (HDL) levels both by increasing the transport rates and decreasing the fractional catabolic rates of HDL cholesterol ester and apolipoprotein (Apo) A-I. Presentation of a new animal model and mechanistic studies in human Apo A-I transgenic and control mice [O]. T Hayek, Y Ito, N Azrolan. 1993
7. Co-optimizing memory-level parallelism and cache-level parallelism [O]. Xulong Tang, Mahmut Taylan Kandemir, Mustafa Karakoy. 2019
8. Development of integrated mechanistically-based degradation-mode models for performance assessment of high-level waste containers [R]. Bedrossian, P, Estill, J, Farmer, J. 1999
