IEEE Transactions on Computers

Macro Data Load: An Efficient Mechanism for Enhancing Loaded Data Reuse

Abstract

This paper presents a study on macro data load, a novel mechanism to increase the amount of loaded data reuse within a processor. A macro data load brings into the processor the maximum-width data that the cache port allows. In a 64-bit processor, for example, a byte load brings the full 64-bit data from the cache and saves it in an internal hardware structure, while using for itself only the specified byte out of those 64 bits. The saved data can be opportunistically reused by later loads inside the processor, reducing relatively more expensive cache accesses. We present a comprehensive availability study using a generalized memory data reuse table (MDRT) to quantify the memory data reuse opportunities available in a set of benchmark programs drawn from the SPEC2k and MiBench suites and to demonstrate the efficacy of the proposed scheme. The macro data load mechanism is shown to open up significantly more loaded data reuse opportunities than previous schemes with no support for spatial locality: we observe 15.1 percent (SPEC2k integer), 20.9 percent (SPEC2k floating-point), and 45.8 percent (MiBench) more load-to-load forwarding instances when a 256-entry MDRT is used. We also describe a modified load store queue design as a possible implementation of the proposed concept. Our quantitative study using a realistic processor model shows that 21.3 percent, 14.8 percent, and 23.6 percent of the L1 cache accesses in the SPEC2k integer, floating-point, and MiBench programs, respectively, can be eliminated, yielding average related energy reductions of 11.4 percent, 9.0 percent, and 14.3 percent.
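To make the reuse idea concrete, below is a minimal C sketch of how a macro data load can let later narrow loads be serviced from previously captured 64-bit data instead of the cache. The table layout, the direct-mapped indexing, and the names macro_load, cache_read_block, and MDRT_ENTRIES are illustrative assumptions for this software model only; the paper's MDRT is a hardware structure inside the processor, and this sketch omits stores and the invalidation they require. Little-endian byte ordering is assumed.

/* Software sketch of the macro-data-load idea from the abstract: every load
 * captures the full 8-byte-aligned block from the cache, and later narrow
 * loads to the same block are serviced from a small internal table (a
 * simplified stand-in for the MDRT) instead of re-accessing the cache. */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define MDRT_ENTRIES 256          /* matches the 256-entry MDRT studied in the abstract */

typedef struct {
    int      valid;
    uint64_t block_addr;          /* address of the 8-byte-aligned block */
    uint64_t data;                /* full-width data captured by a macro load */
} mdrt_entry_t;

static mdrt_entry_t mdrt[MDRT_ENTRIES];
static long cache_accesses, forwarded_loads;

/* Stand-in for an L1 data-cache access returning the aligned 64-bit block. */
static uint64_t cache_read_block(const uint8_t *mem, uint64_t block_addr)
{
    uint64_t block;
    cache_accesses++;
    memcpy(&block, mem + block_addr, sizeof block);
    return block;
}

/* A load of `size` bytes (1, 2, 4, or 8) at byte address `addr`. */
static uint64_t macro_load(const uint8_t *mem, uint64_t addr, unsigned size)
{
    uint64_t block_addr = addr & ~7ULL;            /* 8-byte alignment */
    unsigned idx        = (unsigned)((block_addr >> 3) % MDRT_ENTRIES);
    mdrt_entry_t *e     = &mdrt[idx];

    if (e->valid && e->block_addr == block_addr) {
        forwarded_loads++;                         /* load-to-load forwarding hit */
    } else {
        e->valid      = 1;                         /* miss: fetch and capture the full block */
        e->block_addr = block_addr;
        e->data       = cache_read_block(mem, block_addr);
    }
    /* Extract only the bytes the load actually asked for. */
    unsigned shift = (unsigned)(addr - block_addr) * 8;
    uint64_t mask  = (size == 8) ? ~0ULL : ((1ULL << (size * 8)) - 1);
    return (e->data >> shift) & mask;
}

int main(void)
{
    uint8_t mem[64];
    for (int i = 0; i < 64; i++) mem[i] = (uint8_t)i;

    macro_load(mem, 0, 1);   /* byte load: captures bytes 0..7 from the cache */
    macro_load(mem, 4, 4);   /* word load to the same block: forwarded, no cache access */
    macro_load(mem, 8, 8);   /* different block: goes to the cache */

    printf("cache accesses: %ld, forwarded loads: %ld\n",
           cache_accesses, forwarded_loads);
    return 0;
}

In this toy run the word load at offset 4 hits the block captured by the earlier byte load, so only two of the three loads reach the cache, mirroring the kind of L1 access elimination the abstract quantifies.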