2011 17th IEEE International Conference on Parallel and Distributed Systems

Set Utilization Based Dynamic Shared Cache Partitioning


Abstract

As the number of processors sharing a cache increases, conflict misses caused by interference among competing processes have a growing impact on the performance of each individual process. Cache partitioning is a method of allocating a cache among concurrently executing processes in order to counteract the effects of inter-process conflicts. However, cache partitioning methods commonly divide a shared cache into private partitions, each dedicated to a single processor, which can leave portions of the cache underutilized when set accesses are non-uniform. Our proposed method complements these cache partitioning algorithms by creating an additional shared partition that all processors can use. Underutilized areas of the cache are identified by a monitoring circuit and used for the shared partition. Detection of underutilization is based on the number of unique set accesses for a given allocated way. For a 16-way set-associative cache, the implementation of our method requires 64 bytes of storage overhead per core, in addition to the storage needed by the method that determines the sizes of the private partitions. For the tested system and a number of selected benchmark sets, our method improves performance over the traditional LRU policy by an average of 1.4% and up to 13.3% for a two-core system, and by an average of 1.4% and up to 7.8% for a four-core system. It also improves the performance of a conventional cache partitioning method (Utility-Based Cache Partitioning) by an average of 0.1% and up to 0.5% for both two-core and four-core systems.
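The detection idea in the abstract (counting unique set accesses per allocated way) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's monitoring circuit: the abstract gives 64 bytes per core for 16 ways, which works out to 4 bytes per way, but the 32-bit bitmap layout, the modulo bucket hashing, the 0.5 threshold, and all names such as `WayUtilizationMonitor` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

WAYS = 16          # associativity stated in the abstract
BITS_PER_WAY = 32  # assumption: 64 bytes / 16 ways = 4 bytes (32 bits) per way


@dataclass
class WayUtilizationMonitor:
    """Hypothetical per-core monitor holding one small bitmap per way."""

    threshold: float = 0.5  # assumed fraction of buckets that must be touched
    bitmaps: list = field(default_factory=lambda: [0] * WAYS)

    def record_access(self, way: int, set_index: int) -> None:
        # Assumed hashing: fold the set index into one of BITS_PER_WAY buckets.
        bucket = set_index % BITS_PER_WAY
        self.bitmaps[way] |= 1 << bucket

    def unique_buckets(self, way: int) -> int:
        # Population count approximates the number of unique sets accessed.
        return bin(self.bitmaps[way]).count("1")

    def underutilized_ways(self, allocated_ways):
        # Ways whose unique-access count falls below the assumed threshold
        # are candidates for the shared partition.
        limit = self.threshold * BITS_PER_WAY
        return [w for w in allocated_ways if self.unique_buckets(w) < limit]

    def storage_bytes(self) -> int:
        # Total monitor storage per core in this model.
        return WAYS * BITS_PER_WAY // 8


monitor = WayUtilizationMonitor()
for s in range(64):          # way 0 sees accesses spread over many sets
    monitor.record_access(0, s)
for s in (0, 32, 64):        # way 1 sees accesses that fold into one bucket
    monitor.record_access(1, s)
candidates = monitor.underutilized_ways([0, 1])
```

In this model, way 1 touches only one bucket and would be offered to the shared partition, while way 0 stays private. A real design would also need to reset the bitmaps periodically and decide how the shared partition is replaced, neither of which the abstract specifies.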
