Venue: ACM International Conference on Supercomputing

Locality & Utility Co-optimization for Practical Capacity Management of Shared Last Level Caches



Abstract

Shared last-level caches (SLLCs) on chip multiprocessors play an important role in bridging the performance gap between processing cores and main memory. Although there are already many proposals targeted at overcoming the weaknesses of the least-recently-used (LRU) replacement policy by optimizing either locality or utility for heterogeneous workloads, very few of them are suitable for practical SLLC designs due to their large overhead of log2(associativity) bits per cache line for re-reference interval prediction. The two recently proposed practical replacement policies, TA-DRRIP and SHiP, have significantly reduced the overhead by relying on just 2 bits per line for prediction, but they are oriented towards managing locality only, missing the opportunity provided by utility optimization. This paper is motivated by two key experimental observations: (i) the not-recently-used (NRU) replacement policy, which entails only one bit per line for prediction, can satisfactorily approximate LRU performance; (ii) since locality and utility optimization opportunities are concurrently present in heterogeneous workloads, co-optimizing both is indispensable to higher performance but is missing from existing practical SLLC schemes. Therefore, we propose a novel practical SLLC design, called COOP, which needs just one bit per line for re-reference interval prediction and leverages lightweight per-core locality & utility monitors that profile sampled SLLC sets to guide the co-optimization. COOP improves throughput over LRU by 7.67% on a quad-core CMP with a 4MB SLLC for 200 random workloads, outperforming both recent practical replacement policies at an in-between cost of 17.74KB of storage overhead (TA-DRRIP: 4.53% performance improvement with 16KB storage cost; SHiP: 6.00% performance improvement with 25.75KB storage overhead).
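The 1-bit NRU mechanism that the abstract builds on can be sketched compactly. Below is a minimal, illustrative single-set model, not the paper's implementation; the class name, the eviction-scan order, and the bulk-reset detail are our assumptions. On a hit the line's bit is set; on a miss the first line with a clear bit is evicted, and if every bit is set, all bits are cleared first (this periodic reset is what makes NRU only an approximation of LRU):

```python
class NRUSet:
    """One set of an NRU-managed cache.

    A single 'recently used' bit per line approximates LRU at a cost of
    1 bit/line instead of log2(associativity) bits/line. Storage check
    (assuming 64B lines, which the abstract does not state): a 4MB SLLC
    has 65,536 lines, so 1-bit prediction state totals 8KB, versus 16KB
    for 2-bit schemes like TA-DRRIP and 32KB for 4-bit (16-way) LRU.
    """

    def __init__(self, assoc=16):
        self.assoc = assoc
        self.tags = [None] * assoc
        self.nru = [0] * assoc  # 1 = recently used, 0 = eviction candidate

    def access(self, tag):
        """Simulate one access; return True on hit, False on miss."""
        if tag in self.tags:
            self.nru[self.tags.index(tag)] = 1  # promote on hit
            return True
        if 0 not in self.nru:           # every line recently used:
            self.nru = [0] * self.assoc  # age the whole set at once
        victim = self.nru.index(0)       # first not-recently-used line
        self.tags[victim] = tag
        self.nru[victim] = 1             # insert as recently used
        return False
```

For example, with a 2-way set, accessing A, B, A, C forces the all-bits-set reset on the miss to C, after which A (not B) is the victim; this coarse aging is the price of the 1-bit budget.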
