ACM international conference on supercomputing

Locality Utility Co-optimization for Practical Capacity Management of Shared Last Level Caches



Abstract

Shared last-level caches (SLLCs) on chip multiprocessors play an important role in bridging the performance gap between processing cores and main memory. Although there are already many proposals that overcome the weaknesses of the least-recently-used (LRU) replacement policy by optimizing either locality or utility for heterogeneous workloads, very few of them are suitable for practical SLLC designs because they require log_2(associativity) bits per cache line for re-reference interval prediction. The two recently proposed practical replacement policies, TA-DRRIP and SHiP, have significantly reduced this overhead by relying on just 2 bits per line for prediction, but they are oriented towards managing locality only, missing the opportunity offered by utility optimization. This paper is motivated by two key experimental observations: (i) the not-recently-used (NRU) replacement policy, which needs only one bit per line for prediction, can satisfactorily approximate LRU performance; (ii) since locality and utility optimization opportunities are concurrently present in heterogeneous workloads, co-optimizing both is indispensable for higher performance, yet such co-optimization is missing from existing practical SLLC schemes. We therefore propose a novel practical SLLC design, called COOP, which needs just one bit per line for re-reference interval prediction and leverages lightweight per-core locality and utility monitors that profile sampled SLLC sets to guide the co-optimization. COOP improves throughput over LRU by 7.67% on a quad-core CMP with a 4MB SLLC across 200 random workloads, outperforming both recent practical replacement policies at an in-between storage cost of 17.74KB (TA-DRRIP: 4.53% performance improvement with 16KB storage cost; SHiP: 6.00% performance improvement with 25.75KB storage overhead).
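The abstract's first observation is that NRU needs only one re-reference prediction bit per cache line, versus the log_2(associativity) bits per line of a true LRU stack. The following is a minimal, self-contained C++ sketch of such a one-bit NRU cache set, given only as an illustration of that baseline mechanism; it is not the paper's COOP design, and the class and member names (NruSet, access, find_victim) are assumptions introduced here.

    #include <cstdint>
    #include <cstddef>
    #include <vector>

    // Illustrative one-bit NRU replacement for a single cache set.
    // Each way keeps a single "distant re-reference" bit, as in the
    // NRU baseline the abstract refers to.
    class NruSet {
    public:
        explicit NruSet(std::size_t associativity)
            : tags_(associativity, 0), valid_(associativity, false),
              nru_bit_(associativity, true) {}   // 1 = predicted distant re-reference

        // Returns true on a hit; a hit marks the line as recently used.
        bool access(std::uint64_t tag) {
            for (std::size_t w = 0; w < tags_.size(); ++w) {
                if (valid_[w] && tags_[w] == tag) {
                    nru_bit_[w] = false;          // near-immediate re-reference predicted
                    return true;
                }
            }
            fill(tag);                            // miss: insert after evicting an NRU victim
            return false;
        }

    private:
        void fill(std::uint64_t tag) {
            std::size_t victim = find_victim();
            tags_[victim]    = tag;
            valid_[victim]   = true;
            nru_bit_[victim] = true;              // insert with distant re-reference prediction
        }

        // Victim = first invalid way or first way whose NRU bit is set;
        // if every bit is clear, reset all bits and search again.
        std::size_t find_victim() {
            for (;;) {
                for (std::size_t w = 0; w < tags_.size(); ++w) {
                    if (!valid_[w] || nru_bit_[w]) return w;
                }
                for (std::size_t w = 0; w < tags_.size(); ++w) nru_bit_[w] = true;
            }
        }

        std::vector<std::uint64_t> tags_;
        std::vector<bool> valid_;
        std::vector<bool> nru_bit_;
    };

Per the abstract, COOP builds on this one-bit prediction state and steers it with lightweight per-core locality and utility monitors that sample SLLC sets; those monitors are not modeled in this sketch.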
