Adaptive Granularity Based Last-Level Cache Prefetching Method with eDRAM Prefetch Buffer for Graph Processing Applications



Abstract

The emergence of big data processing and machine learning has triggered exponential growth in application working set sizes. In addition, many modern applications are memory intensive and exhibit irregular memory access patterns. Therefore, we propose the concept of adaptive granularity and develop a prefetching methodology that analyzes memory access patterns over a wider granularity range spanning both cache-line and page granularity. The proposed prefetching module resides at the last-level cache (LLC) to handle the large working sets of memory-intensive workloads. Additionally, to support memory access streams with variable intervals, we introduce an embedded-DRAM-based LLC prefetch buffer that consists of three granularity-based prefetch engines and an access history table. By adaptively changing the granularity window used to analyze memory streams, the proposed model can swiftly and appropriately determine the stride of memory addresses and generate hidden delta chains from irregular memory access patterns. The proposed model achieves 18% and 15% improvements in energy consumption and execution time compared to the global history buffer and best-offset prefetchers, respectively. In addition, our model reduces total execution time and energy consumption by approximately 6% and 2.3% compared to the Markov prefetcher and variable-length delta prefetcher.
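
The sketch below is a minimal illustration of the mechanism the abstract describes: it records recent LLC misses in a per-page access history table, checks whether the deltas between consecutive miss addresses are stable at cache-line granularity, and falls back to a coarser page-granularity prefetch otherwise. The class name, table sizes, and fallback heuristic are illustrative assumptions; the paper's actual three-engine design and eDRAM buffer organization are not reproduced here.

```cpp
// Hypothetical sketch of adaptive-granularity delta prefetching (not the
// paper's implementation): per-page history, line-granularity delta check,
// page-granularity fallback for irregular streams.
#include <cstdint>
#include <deque>
#include <iostream>
#include <unordered_map>
#include <vector>

constexpr uint64_t kLineBits = 6;   // 64 B cache line (assumed)
constexpr uint64_t kPageBits = 12;  // 4 KiB page (assumed)
constexpr size_t   kHistory  = 4;   // misses remembered per page (assumed)

struct RegionHistory {
    std::deque<uint64_t> lines;  // recent miss addresses, line-aligned
};

class AdaptiveGranularityPrefetcher {
public:
    // Returns prefetch candidate addresses for one demand miss.
    std::vector<uint64_t> on_miss(uint64_t addr) {
        uint64_t line = addr >> kLineBits;
        uint64_t page = addr >> kPageBits;

        auto& hist = table_[page].lines;
        hist.push_back(line);
        if (hist.size() > kHistory) hist.pop_front();

        std::vector<uint64_t> out;
        if (hist.size() < 3) return out;  // need two deltas to compare

        // Deltas at cache-line granularity within the page.
        int64_t d1 = int64_t(hist[hist.size() - 1]) - int64_t(hist[hist.size() - 2]);
        int64_t d2 = int64_t(hist[hist.size() - 2]) - int64_t(hist[hist.size() - 3]);

        if (d1 == d2 && d1 != 0) {
            // Stable line-granularity stride: extend the delta chain by one line.
            out.push_back((hist.back() + d1) << kLineBits);
        } else {
            // Irregular at line granularity: coarse page-granularity fallback,
            // fetching the start of the next page (a simplifying heuristic).
            out.push_back((page + 1) << kPageBits);
        }
        return out;
    }

private:
    std::unordered_map<uint64_t, RegionHistory> table_;  // access history table
};

int main() {
    AdaptiveGranularityPrefetcher pf;
    // Regular stride of two lines: the third miss triggers a line-level prefetch.
    for (uint64_t a : {0x1000ULL, 0x1080ULL, 0x1100ULL}) {
        for (uint64_t p : pf.on_miss(a))
            std::cout << "prefetch 0x" << std::hex << p << std::dec << "\n";
    }
    return 0;
}
```

In this toy run the first two misses only populate the history; the third confirms a two-line stride and yields a prefetch for 0x1180, while a non-repeating delta sequence would instead trigger the page-granularity fallback.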
