ACM/IEEE Annual International Symposium on Computer Architecture

Criticality Aware Tiered Cache Hierarchy: A Fundamental Relook at Multi-Level Cache Hierarchies



Abstract

On-die caches are a popular method to help hide the main memory latency. However, it is difficult to build large caches without substantially increasing their access latency, which in turn hurts performance. To overcome this difficulty, on-die caches are typically built as a multi-level cache hierarchy. One such popular hierarchy, adopted by modern microprocessors, is the three-level cache hierarchy. A three-level cache hierarchy enables a low average hit latency, since most requests are serviced from the faster inner-level caches. This has motivated recent microprocessors to deploy large level-2 (L2) caches that can help further reduce the average hit latency. In this paper, we perform a fundamental analysis of the popular three-level cache hierarchy and examine its performance delivery through the lens of program criticality. Through our detailed analysis we show that the current trend of increasing L2 cache sizes to reduce average hit latency is, in fact, an inefficient design choice. We instead propose the Criticality Aware Tiered Cache Hierarchy (CATCH), which uses an accurate hardware detection of program criticality and a novel set of inter-cache prefetchers to ensure that on-die data accesses that lie on the critical path of execution are served at the latency of the fastest, level-1 (L1) cache. The last-level cache (LLC) serves the purpose of reducing slow memory accesses, thereby making the large L2 cache redundant for most applications. The area saved by eliminating the L2 cache can then be used to create more efficient processor configurations. Our simulation results show that CATCH outperforms the three-level cache hierarchy with a large 1MB L2 and exclusive LLC by an average of 8.4%, and a baseline with a 256KB L2 and inclusive LLC by 10.3%. We also show that CATCH enables a powerful framework to explore broad chip-level area, performance, and power trade-offs in cache hierarchy design. Supported by CATCH, we evaluate radical architecture directions such as eliminating the L2 altogether, and show that such architectures can yield a 4.5% performance gain over the baseline at nearly 30% less area, or improve performance by 7.3% at the same area while reducing energy consumption by 11%.
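The abstract's argument rests on the average-hit-latency arithmetic of a multi-level hierarchy: each level's latency is paid only by requests that miss all inner levels. The sketch below is a back-of-envelope average memory access time (AMAT) model with illustrative, assumed latencies and hit rates (not figures from the paper); it shows how a two-level hierarchy with a slightly higher effective L1 hit rate, as criticality-aware prefetching aims to achieve for critical loads, can match or beat a three-level one.

```python
# Illustrative AMAT (average memory access time) model. All latencies
# (in cycles) and hit rates below are assumed example values, NOT
# figures from the paper.

def amat(levels, mem_latency):
    """levels: list of (latency, hit_rate) pairs, innermost first."""
    total = 0.0
    miss_prob = 1.0  # probability a request reaches this level
    for latency, hit_rate in levels:
        total += miss_prob * latency
        miss_prob *= (1.0 - hit_rate)
    return total + miss_prob * mem_latency

# Three-level hierarchy: L1, large L2, LLC (assumed values).
three_level = amat([(4, 0.90), (14, 0.60), (40, 0.50)], 200)

# Two-level hierarchy (no L2): L1, LLC. Criticality-aware prefetching
# is modeled crudely as higher effective L1 and LLC hit rates.
two_level = amat([(4, 0.95), (40, 0.55)], 200)

print(f"3-level AMAT: {three_level:.1f} cycles")  # 11.0 with these inputs
print(f"2-level AMAT: {two_level:.1f} cycles")    # 10.5 with these inputs
```

Under these assumed numbers the L2-less hierarchy is slightly faster on average, which mirrors the paper's claim that the large L2 is redundant once critical accesses hit in the L1.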


