International Symposium on High Performance Computer Architecture

Techniques for Bandwidth-Efficient Prefetching of Linked Data Structures in Hybrid Prefetching Systems



Abstract

Linked data structure (LDS) accesses are critical to the performance of many large-scale applications. Techniques have been proposed to prefetch such accesses. Unfortunately, many LDS prefetching techniques 1) generate a large number of useless prefetches, thereby degrading performance and bandwidth efficiency, 2) require significant hardware or storage cost, or 3) when employed together with stream-based prefetchers, cause significant resource contention in the memory system. As a result, existing processors do not employ LDS prefetchers even though they commonly employ stream-based prefetchers. This paper proposes a low-cost hardware/software cooperative technique that enables bandwidth-efficient prefetching of linked data structures. Our solution has two new components: 1) a compiler-guided prefetch filtering mechanism that informs the hardware about which pointer addresses to prefetch, and 2) a coordinated prefetcher throttling mechanism that uses run-time feedback to manage the interference between multiple prefetchers (LDS and stream-based) in a hybrid prefetching system. Evaluations on a set of memory- and pointer-intensive applications show that the proposed solution improves average performance by 22.5% while decreasing memory bandwidth consumption by 25% over a baseline system that employs an effective stream prefetcher. We compare our proposal to three different LDS/correlation prefetching techniques and find that it provides significantly better performance on both single-core and multi-core systems, while incurring lower hardware cost.
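The two components named in the abstract can be illustrated with a small software model. This is only a minimal sketch under stated assumptions, not the paper's mechanism: the names `filter_pointers`, `PrefetcherStats`, and `throttle`, the single accuracy threshold, and the five aggressiveness levels are all hypothetical and chosen for illustration. The abstract does not specify which feedback metrics or thresholds the hardware uses.

```python
from dataclasses import dataclass

MAX_LEVEL = 4          # assumed number of aggressiveness steps (0 = off)
ACC_THRESHOLD = 0.5    # assumed accuracy cutoff; not taken from the paper


def filter_pointers(candidate_offsets, hinted_offsets):
    """Keep only the pointer fields that a (hypothetical) compiler hint
    marks as worth prefetching; all other candidates are dropped."""
    return [off for off in candidate_offsets if off in hinted_offsets]


@dataclass
class PrefetcherStats:
    issued: int = 0    # prefetches sent during the current interval
    useful: int = 0    # prefetched lines that were later used
    level: int = 2     # current aggressiveness level

    def accuracy(self) -> float:
        # With no prefetches issued there is no evidence of harm,
        # so this sketch treats an idle prefetcher as accurate.
        return self.useful / self.issued if self.issued else 1.0


def throttle(lds: PrefetcherStats, stream: PrefetcherStats) -> None:
    """One feedback interval of an assumed coordinated policy:
    an inaccurate prefetcher is throttled down; an accurate one is
    throttled up only when the other prefetcher is inaccurate, so it
    can use the bandwidth that is being freed."""
    lds_ok = lds.accuracy() >= ACC_THRESHOLD
    stream_ok = stream.accuracy() >= ACC_THRESHOLD
    for me, me_ok, other_ok in ((lds, lds_ok, stream_ok),
                                (stream, stream_ok, lds_ok)):
        if not me_ok:
            me.level = max(0, me.level - 1)
        elif not other_ok:
            me.level = min(MAX_LEVEL, me.level + 1)
        # Start a fresh measurement interval.
        me.issued = 0
        me.useful = 0
```

In this model, an LDS prefetcher with 20 useful prefetches out of 100 would drop one level while an accurate stream prefetcher would rise one level. The point of the coordination is that each decision depends on both prefetchers' feedback rather than on one in isolation.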
