IEEE International Symposium on Circuits and Systems

DRAM Access Reduction in GPUs by Thread-Block Scheduling for Overlapped Data Reuse


Abstract

General-Purpose Graphics Processing Units (GPGPUs) achieve very high throughput when executing parallel programs. However, they usually demand very large DRAM bandwidth and consume considerable power for memory access. Although recent high-performance GPGPUs are equipped with an L2 cache to absorb some DRAM accesses, the cache hit ratio can hardly be very high because of the limited cache size. We propose a GPU thread-block scheduling method that better utilizes the L2 cache and reduces DRAM memory accesses. The method exploits inter-block locality in the scheduling of GPU thread-blocks and can easily be implemented by modifying application programs. Applied to the Hotspot benchmark program, this technique reduces DRAM accesses by up to 39%.
