IEEE International Symposium on Circuits and Systems

DRAM Access Reduction in GPUs by Thread-Block Scheduling for Overlapped Data Reuse


Abstract

General-Purpose Graphics Processing Units (GPGPUs) achieve very high throughput when executing parallel programs. However, they usually demand very large DRAM bandwidth and consume considerable power for memory access. Although recent high-performance GPGPUs are equipped with an L2 cache to absorb some DRAM accesses, the cache hit ratio can hardly be very high because of the limited cache size. We propose a GPU thread-block scheduling method that better utilizes the L2 cache and reduces DRAM memory accesses. The method exploits inter-block locality in the scheduling of GPU thread-blocks and can easily be implemented by modifying application programs. Applied to the Hotspot benchmark program, this technique reduces DRAM accesses by up to 39%.
