International Conference on Parallel Architectures and Compilation Techniques

Introducing Hierarchy-awareness in Replacement and Bypass Algorithms for Last-level Caches

Abstract

Replacement policies for last-level caches (LLCs) are usually designed based on the access information available locally at the LLC. These policies are inherently sub-optimal due to the lack of information about activity in the inner levels of the hierarchy. This paper introduces cache hierarchy-aware replacement (CHAR) algorithms for inclusive LLCs (or L3 caches) and applies the same algorithms to implement efficient bypass techniques for exclusive LLCs in a three-level hierarchy. In a hierarchy with an inclusive LLC, these algorithms mine the L2 cache eviction stream and decide whether a block evicted from the L2 cache should be made a victim candidate in the LLC based on the access pattern of the evicted block. Ours is the first proposal that explores the possibility of using a subset of L2 cache eviction hints to improve the replacement algorithms of an inclusive LLC. The CHAR algorithm classifies the blocks residing in the L2 cache based on their reuse patterns and dynamically estimates the reuse probability of each class of blocks to generate selective replacement hints to the LLC. Compared to the static re-reference interval prediction (SRRIP) policy, our proposal offers an average reduction of 10.9% in LLC misses and an average improvement of 3.8% in instructions retired per cycle (IPC) for twelve single-threaded applications. The corresponding reduction in LLC misses for one hundred 4-way multi-programmed workloads is 6.8%, leading to an average improvement of 3.9% in throughput. Finally, our proposal achieves an 11.1% reduction in LLC misses and a 4.2% reduction in parallel execution cycles for six 8-way threaded shared memory applications compared to the SRRIP policy. In a cache hierarchy with an exclusive LLC, our CHAR proposal offers an effective algorithm for selecting the subset of blocks (clean or dirty) evicted from the L2 cache that need not be written to the LLC and can be bypassed. Compared to the TC-AGE policy (the analogue of SRRIP for exclusive LLCs), our best exclusive LLC proposal improves average throughput by 3.2% while saving an average of 66.6% of the data transactions from the L2 cache to the on-die interconnect for one hundred 4-way multi-programmed workloads. Compared to an inclusive LLC design with an identical hierarchy, this corresponds to an average throughput improvement of 8.2% with only 17% more data write transactions originating from the L2 cache.
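To make the hint-generation idea concrete, the C++ sketch below shows one simplified way such a mechanism could be organized: each block evicted from the L2 cache carries a reuse-class tag, per-class counters approximate the reuse probability of that class, and a low estimated probability turns the eviction into a victim-candidate hint for an inclusive LLC or a bypass decision for an exclusive LLC. The class definitions, counter organization, and threshold used here are illustrative assumptions for exposition, not the exact design described in the paper.

// A minimal sketch of hierarchy-aware hint generation, under simplified
// assumptions (hypothetical reuse classes and an illustrative threshold).

#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reuse classes assigned to a block when it leaves the L2 cache.
enum class ReuseClass : std::uint8_t {
    NoL2Reuse,     // filled into L2 and never hit again
    L2ReuseClean,  // hit at least once in L2, never written
    L2ReuseDirty,  // hit at least once in L2 and written
    kCount
};

// Per-class bookkeeping: evictions from L2 and subsequent reuses observed for
// blocks of that class. Their ratio approximates the class's reuse probability.
struct ClassStats {
    std::uint32_t evictions = 0;
    std::uint32_t reuses = 0;

    double reuse_probability() const {
        return evictions ? static_cast<double>(reuses) / evictions : 1.0;
    }
};

class CharHintEngine {
public:
    // Called on every L2 eviction. Returns true if the LLC should treat the
    // block as a victim candidate (inclusive LLC) or skip the fill entirely
    // (exclusive LLC); returns false if the block is worth retaining.
    bool on_l2_eviction(ReuseClass cls) {
        ClassStats &s = stats_[static_cast<std::size_t>(cls)];
        ++s.evictions;
        return s.reuse_probability() < kLowReuseThreshold;
    }

    // Called when a block that previously left the L2 cache is referenced
    // again (an LLC hit, or a re-fetch after a bypass), crediting its class.
    void on_reuse(ReuseClass cls) {
        ++stats_[static_cast<std::size_t>(cls)].reuses;
    }

private:
    static constexpr double kLowReuseThreshold = 0.25;  // illustrative value
    std::array<ClassStats, static_cast<std::size_t>(ReuseClass::kCount)> stats_{};
};

In this reading of the abstract, an inclusive hierarchy uses the hint only to mark the incoming block as a preferred victim candidate in the LLC, whereas an exclusive hierarchy uses the same low-reuse decision to suppress the L2-to-LLC fill entirely, which accounts for the reduction in L2-to-interconnect data transactions reported above.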
