ACM/IEEE International Symposium on Computer Architecture

The Direct-to-Data (D2D) Cache: Navigating the Cache Hierarchy with a Single Lookup



Abstract

Modern processors optimize for cache energy and performance by employing multiple levels of caching that address bandwidth, low latency, and high capacity. A request typically traverses the cache hierarchy, level by level, until the data is found, thereby wasting time and energy at each level. In this paper, we present the Direct-to-Data (D2D) cache that locates data across the entire cache hierarchy with a single lookup. To navigate the cache hierarchy, D2D extends the TLB with per cache-line location information that indicates in which cache and way the cache line is located. This allows the D2D cache to: 1) skip levels in the hierarchy (by accessing the right cache level directly), 2) eliminate extra data array reads (by reading the right way directly), 3) avoid tag comparisons (by eliminating the tag arrays), and 4) go directly to DRAM on cache misses (by checking the TLB). This reduces the L2 latency by 40% and saves 5-17% of the total cache hierarchy energy. D2D's lower L2 latency directly improves L2-sensitive applications' performance by 5-14%. More significantly, we can take advantage of the L2 latency reduction to optimize other parts of the micro-architecture. For example, we can reduce the ROB size for L2-bound applications by 25%, or we can reduce the L1 cache size, delivering an overall 21% energy savings across all benchmarks, without hurting performance.
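The abstract describes extending the TLB with per cache-line location information (which cache level and which way holds each line) so that a single TLB lookup also resolves where the data lives. The following C sketch illustrates that idea only conceptually; the structure names, field widths, and placeholder accessors (d2d_tlb_entry, cl_location, l1_read_way, l2_read_way, dram_read) are assumptions made for illustration and do not come from the paper.

```c
#include <stdint.h>
#include <stdio.h>

/* Minimal sketch of the D2D idea: each TLB entry carries, for every cache
 * line in its page, a record of which cache level (and which way) currently
 * holds the line. Zero-initialized entries mean "not cached". */

enum d2d_level { D2D_NOT_CACHED = 0, D2D_IN_L1, D2D_IN_L2 };

struct cl_location {
    uint8_t level;  /* enum d2d_level: which cache, if any, holds the line */
    uint8_t way;    /* way inside that cache's set; valid only if cached   */
};

#define LINES_PER_PAGE 64   /* 4 KiB page / 64-byte cache lines */

struct d2d_tlb_entry {
    uint64_t vpn;                            /* virtual page number        */
    uint64_t ppn;                            /* physical page number       */
    struct cl_location loc[LINES_PER_PAGE];  /* per-line location metadata */
};

/* Placeholder data sources standing in for real cache data arrays and DRAM. */
static uint64_t l1_read_way(uint8_t way, uint64_t vaddr) { (void)way; return vaddr ^ 0x1; }
static uint64_t l2_read_way(uint8_t way, uint64_t vaddr) { (void)way; return vaddr ^ 0x2; }
static uint64_t dram_read(uint64_t ppn, uint64_t vaddr)  { (void)ppn; return vaddr ^ 0x3; }

/* One combined lookup: the TLB access that every memory operation already
 * performs also says where the data lives, so the load can skip cache levels,
 * read a single data-array way, avoid tag comparisons, or go straight to DRAM. */
static uint64_t d2d_load(const struct d2d_tlb_entry *e, uint64_t vaddr)
{
    unsigned line = (unsigned)(vaddr >> 6) & (LINES_PER_PAGE - 1);
    const struct cl_location *loc = &e->loc[line];

    switch (loc->level) {
    case D2D_IN_L1: return l1_read_way(loc->way, vaddr);  /* hit in L1         */
    case D2D_IN_L2: return l2_read_way(loc->way, vaddr);  /* skip L1 entirely  */
    default:        return dram_read(e->ppn, vaddr);      /* miss: direct DRAM */
    }
}

int main(void)
{
    struct d2d_tlb_entry e = { .vpn = 0x12, .ppn = 0x34 };
    e.loc[1].level = D2D_IN_L2;  /* pretend line 1 of this page sits in L2, way 3 */
    e.loc[1].way   = 3;
    printf("%llu\n", (unsigned long long)d2d_load(&e, (0x12ULL << 12) | (1u << 6)));
    return 0;
}
```

In this sketch the per-line records replace the tag arrays: the lookup never compares tags, it either reads the one recorded way at the recorded level or goes directly to DRAM, which mirrors the four benefits the abstract lists.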
