
Improving Cache Partitioning Algorithms for Pseudo-LRU Policies


Abstract

As the number of concurrently running applications on chip multiprocessors (CMPs) increases, efficient management of the shared last-level cache (LLC) is crucial to guarantee overall performance. Recent studies have shown that cache partitioning can provide benefits in throughput, fairness, and quality of service. Most prior works adopt true Least Recently Used (LRU) as the underlying cache replacement policy and rely on its stack property to work properly. However, commodity processors commonly use pseudo-LRU policies, which lack the stack property, instead of true LRU because of their simplicity and low storage overhead. This study therefore sets out to understand whether LRU-based cache partitioning techniques can be applied to commodity processors. In this work, we propose a cache partitioning mechanism for two popular pseudo-LRU policies: Not Recently Used (NRU) and Binary Tree (BT). Without the help of true LRU's stack property, we propose profiling logic that applies curve approximation methods to derive the hit curve (hit counts under varied way allocations) for an application. We then propose a hybrid partitioning mechanism that mitigates the gap between the predicted hit curve and the actual statistics. Simulation results demonstrate that our proposal improves throughput by 15.3% on average and outperforms the stack-estimate proposal by 12.6% on average; similar results are achieved in weighted speedup. For the cache configurations under study, the mechanism requires less than 0.5% storage overhead relative to the last-level cache. In addition, we show that a profiling mechanism with only one true LRU ATD achieves comparable performance and can further reduce the hardware cost by nearly two thirds compared with the hybrid mechanism.
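For context on the Binary Tree (BT) pseudo-LRU policy that the abstract refers to, the following is a minimal sketch of a standard tree-PLRU replacement scheme for a single cache set; it is illustrative background only, not the paper's partitioning mechanism, and the class name `TreePLRU` and its layout are assumptions made here for illustration. Unlike true LRU, the policy keeps only W-1 bits per set rather than a full recency ordering, which is why the stack property (and hence stack-distance-based hit curves) is unavailable.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch of Binary Tree (BT) pseudo-LRU for one cache set with a
// power-of-two associativity. W-1 internal tree bits are kept per set; each bit
// points toward the pseudo-LRU half of its subtree.
class TreePLRU {
public:
    explicit TreePLRU(unsigned ways) : ways_(ways), bits_(ways, 0) {}

    // On a hit or fill of 'way', update the bits on the path from the leaf to
    // the root so that they point away from the just-accessed way.
    void touch(unsigned way) {
        unsigned node = way + ways_;                       // leaf index in the implicit heap
        while (node > 1) {
            unsigned parent = node / 2;
            bits_[parent] = (node == 2 * parent) ? 1 : 0;  // came from left child -> point right
            node = parent;
        }
    }

    // Select a victim by following the tree bits from the root down to a leaf.
    unsigned victim() const {
        unsigned node = 1;
        while (node < ways_)
            node = 2 * node + bits_[node];
        return node - ways_;                               // convert leaf index back to a way number
    }

private:
    unsigned ways_;                  // associativity (assumed to be a power of two)
    std::vector<uint8_t> bits_;      // bits_[1..ways_-1] are the internal tree nodes
};
```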

Bibliographic Details

  • Source
    IEICE Transactions on Information and Systems | 2013, No. 12 | pp. 2514-2523 | 10 pages
  • Author Affiliations

    Key Laboratory of Trustworthy Distributed Computing and Service of Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China;

    Key Laboratory of Trustworthy Distributed Computing and Service of Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, China;

    Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China;

    Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China;

  • Indexing: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: eng
  • CLC Classification:
  • Keywords

    cache partitioning; pseudo-LRU; curve approximation;

