Source: Conference on Computing Frontiers

A case for a working-set-based memory hierarchy



Abstract

Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design. Unfortunately, corresponding improvements in memory design technology have not been realized, resulting in latencies of over 100 cycles between processors and main memory. This ever-increasing gap in speed has pushed the current memory-hierarchy approach to its limit.

Traditional approaches to memory-hierarchy management have not yielded satisfactory results. Hardware solutions require more power and energy than desired and do not scale well. Compiler solutions tend to miss too many optimization opportunities because of limited compile-time knowledge of run-time behavior. This paper explores a different approach that combines both approaches by making use of the static knowledge obtained by the compiler in the dynamic decision making of the micro-architecture. We propose a memory-hierarchy design based on working sets that uses compile-time annotations regarding the working set of memory operations to guide cache placement decisions.

Our experiments show that a working-set-based memory hierarchy can significantly reduce the miss rate for memory-intensive tiled kernels by limiting cross interference. The working-set-based memory hierarchy allows the compiler to tile many loops without concern for cross interference in the cache, making tile size choice easier. In addition, the compiler can more easily tailor tile choices to the separate needs of different working sets.
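The effect of cross interference, and of placing each working set in its own cache region, can be illustrated with a toy simulator. This is a minimal sketch under stated assumptions, not the design evaluated in the paper: it models a direct-mapped cache, splits it into equal static partitions, and treats the working-set ID attached to each access as a stand-in for a compile-time annotation. The names `simulate` and `make_trace`, the cache geometry, and the trace itself are all hypothetical.

```python
def simulate(trace, num_lines, line_size=64, partitions=1):
    """Count misses in a toy direct-mapped cache.

    trace      : list of (working_set_id, byte_address) pairs; the ID is an
                 assumed stand-in for a compiler-supplied annotation.
    partitions : 1 models a conventional shared cache; a larger value gives
                 each working set its own equal-sized region (an assumption).
    """
    lines_per_part = num_lines // partitions
    tags = [None] * num_lines          # block number held by each cache line
    misses = 0
    for ws_id, addr in trace:
        block = addr // line_size
        # The working-set ID selects the region; the block number selects
        # the line inside that region.
        slot = (ws_id % partitions) * lines_per_part + block % lines_per_part
        if tags[slot] != block:
            tags[slot] = block
            misses += 1
    return misses


def make_trace(passes, lines_per_tile, line_size, stride_bytes):
    """Interleave accesses to two tiles whose lines collide in a shared cache."""
    trace = []
    for _ in range(passes):
        for i in range(lines_per_tile):
            trace.append((0, i * line_size))                 # tile of array A
            trace.append((1, stride_bytes + i * line_size))  # tile of array B
    return trace


LINE = 64
# Tile B starts exactly one cache size (8 lines) after tile A, so in a shared
# direct-mapped cache every line of B maps onto the matching line of A.
trace = make_trace(passes=10, lines_per_tile=4, line_size=LINE,
                   stride_bytes=8 * LINE)

shared_misses = simulate(trace, num_lines=8, line_size=LINE, partitions=1)
split_misses = simulate(trace, num_lines=8, line_size=LINE, partitions=2)
print("shared cache misses:     ", shared_misses)
print("partitioned cache misses:", split_misses)
```

In this contrived trace the shared cache misses on all 80 accesses because the two tiles evict each other on every reference, while the partitioned cache takes only the 8 compulsory misses. Real kernels and real caches (set-associative, with replacement policies) would show a smaller gap; the sketch only isolates the interference effect the abstract refers to.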
