2014 22nd Annual IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems

Quantifying and Optimizing the Impact of Victim Cache Line Selection in Manycore Systems



Abstract

In both architecture and software, the main goal of data locality-oriented optimizations has always been "minimizing the number of cache misses" (especially costly last-level cache misses). However, this paper shows that other metrics, such as the distance between the last-level cache and the memory controller as well as the memory queuing latency, can play an equally important role as far as application performance is concerned. Focusing on a large set of multithreaded applications, we first show that last-level cache "write backs" (memory writes due to displacement of a victim block from the last-level cache) can exhibit significant latencies and variances, and then make a case for "relaxing" the strict LRU policy to save (write back) cycles in both the on-chip network and memory queues. Specifically, we explore novel architecture-level schemes that optimize the on-chip network latency, the memory queuing latency, or both, of the write back messages, by carefully selecting the victim block to write back at the time of cache replacement. Our extensive experimental evaluations using 15 multithreaded applications and a cycle-accurate simulation infrastructure clearly demonstrate that this tradeoff (between cache hit rate and on-chip network/memory queuing latency) pays off in most cases, leading to about 12.2% execution time improvement and 14.9% energy savings in our default 64-core system with 6 memory controllers.
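The idea of "relaxing" strict LRU can be sketched as follows. This is a minimal illustration, not the paper's actual scheme: it assumes a hypothetical per-block cost model in which a dirty block's write-back cost is the sum of its on-chip network distance (hops) to its memory controller and that controller's current queue delay, and it picks the cheapest victim among the k least-recently-used blocks of a set.

```python
# Sketch of relaxed-LRU victim selection (illustrative only).
# Instead of always evicting the strict LRU block, consider the k oldest
# blocks in the set and evict the one whose write-back is cheapest:
# clean blocks cost nothing, dirty blocks pay network hops plus queue delay.
# The cost model and field layout here are assumptions for illustration.

def select_victim(candidates, k=4):
    """candidates: list of (age_rank, dirty, hops_to_mc, mc_queue_delay)
    tuples, ordered oldest first. Returns the index (within the LRU
    window) of the block to evict."""
    window = candidates[:k]  # restrict the choice to the k oldest blocks

    def writeback_cost(block):
        _, dirty, hops, queue_delay = block
        if not dirty:
            return 0  # clean eviction generates no write-back message
        return hops + queue_delay  # on-chip distance + memory queuing

    # Evict the cheapest block within the LRU window; ties favor older blocks
    # because min() keeps the first (oldest) candidate on equal cost.
    return min(range(len(window)), key=lambda i: writeback_cost(window[i]))

# Example: the strict-LRU victim (index 0) is dirty and far from its memory
# controller, while a slightly younger clean block can be evicted for free.
blocks = [(0, True, 8, 20), (1, True, 2, 5), (2, False, 6, 30), (3, True, 1, 1)]
print(select_victim(blocks))  # -> 2 (the clean block)
```

The design point this sketch captures is the tradeoff the abstract describes: picking a non-LRU victim may slightly hurt the hit rate, but it can avoid an expensive write-back traversal of the on-chip network and a long memory-controller queue.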
