Cooperative hardware/software caching for next-generation memory systems.

Abstract

The memory system remains a major performance bottleneck in modern and future architectures. In this dissertation, we propose a hardware/software cooperative approach and demonstrate its effectiveness. This approach combines the global yet imperfect view of the compiler with the timely yet narrow-scope context of the hardware. It relies on a lightweight extension to the instruction set architecture to convey compile-time knowledge (hints) to the hardware. The hardware then uses these hints to make better decisions.

Our work shows that a cooperative hardware/software approach to (1) cache replacement, (2) prefetching, and (3) their combination eliminates or tolerates much of the memory performance bottleneck.

(1) Our work enhances cache replacement decisions using compiler hints. The compiler detects which data will or will not be reused and annotates loads accordingly. The compiler sets one bit (the evict-me bit) to denote a preferred eviction candidate. On a miss, the cache replacement algorithm preferentially replaces a cache line with its evict-me bit set. Otherwise, it follows the LRU policy. The evict-me replacement scheme improves cache replacement decisions and is effective in both L1 and L2 caches.

(2) We also use compiler hints to direct aggressive hardware region prefetching and content-aware pointer prefetching. The original SRP (scheduled region prefetching) engine queues prefetching requests on every outstanding L2 miss and tolerates latencies at the cost of dramatically increasing the memory traffic. GRP (guided region prefetching) enhances SRP by restricting prefetching to compiler-marked loads. Our compiler algorithms effectively mark spatial reuses across the SPEC CPU2000 benchmarks, and thus GRP achieves the performance of SRP with only one-eighth of the additional traffic.

(3) The evict-me cache replacement scheme helps alleviate the side effects of cache pollution introduced by useless region prefetches. The combination of evict-me caching and region prefetching further improves cache performance.

These results demonstrate significant promise for overcoming the memory bottleneck with cooperative hardware/software techniques.
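
As an illustration of the evict-me replacement rule described in the abstract (prefer a line whose evict-me bit is set, otherwise fall back to LRU), the following is a minimal sketch of a single cache set. It is not the dissertation's implementation: the names `Line`, `EvictMeSet`, and `access` are hypothetical, and two details are assumptions made only for this example, namely that a hit overwrites the line's hint with the hint of the current load, and that among several marked lines the least recently used one is chosen.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: int
    evict_me: bool = False  # compiler hint: preferred eviction candidate


class EvictMeSet:
    """One set of a set-associative cache (hypothetical model).

    self.lines is kept in recency order: index 0 is the least recently
    used line, the last element is the most recently used one.
    """

    def __init__(self, ways: int):
        self.ways = ways
        self.lines: list[Line] = []

    def access(self, tag: int, evict_me: bool = False) -> bool:
        """Simulate one load; return True on a hit, False on a miss."""
        for i, line in enumerate(self.lines):
            if line.tag == tag:
                # Assumption: the hint carried by the latest load wins.
                line.evict_me = evict_me
                self.lines.append(self.lines.pop(i))  # move to MRU position
                return True

        if len(self.lines) >= self.ways:
            # Prefer a line with its evict-me bit set (assumption: the least
            # recently used such line); otherwise evict the plain LRU line.
            victim = next(
                (i for i, line in enumerate(self.lines) if line.evict_me), 0
            )
            self.lines.pop(victim)

        self.lines.append(Line(tag, evict_me))
        return False


if __name__ == "__main__":
    s = EvictMeSet(ways=2)
    s.access(1)                  # miss, line 1 installed
    s.access(2, evict_me=True)   # miss, line 2 installed and marked
    s.access(3)                  # miss: marked line 2 is evicted, not LRU line 1
    print(s.access(1))           # line 1 survived, so this access hits
```

In this two-way example, plain LRU would have evicted line 1 on the third access; the hint on line 2 keeps line 1 resident instead.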

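Bibliographic record

  • Author

    Wang, Zhenlin

  • Author affiliation

    University of Massachusetts Amherst

  • Degree-granting institution: University of Massachusetts Amherst
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 143 p.
  • Total pages: 143
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification: Automation technology, computer technology
  • Keywords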
