ACM/EDAC/IEEE Design Automation Conference

TEMP: Thread batch enabled memory partitioning for GPU

Abstract

Because massive multithreading on GPUs puts tremendous pressure on the memory subsystem, efficient bandwidth utilization becomes a key factor in GPU throughput. In this work, we propose thread batch enabled memory partitioning (TEMP), which improves GPU performance by improving memory bandwidth utilization. Specifically, TEMP clusters multiple thread blocks that share the same set of pages into a thread batch and dispatches the entire thread batch to a streaming multiprocessor. Through OS memory management, TEMP separates the memory access streams of different thread batches, preserving the intrinsic locality of each thread batch and increasing memory access parallelism. Experimental results show that TEMP achieves up to 10.3% performance improvement and 14.6% DRAM energy reduction compared to a state-of-the-art scheduler without any memory-side optimizations.
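
The batching step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a simplified model that assumes a 4 KiB page size, assumes each thread block's accessed byte addresses are known in advance, and groups blocks only when their page sets match exactly. The names `pages_touched`, `form_thread_batches`, and `dispatch` are hypothetical, and the round-robin assignment to streaming multiprocessors (SMs) is a placeholder policy. The OS-level memory partitioning step is not modeled here.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


def pages_touched(addresses, page_size=PAGE_SIZE):
    """Return the set of page numbers covered by a list of byte addresses."""
    return frozenset(addr // page_size for addr in addresses)


def form_thread_batches(block_addresses):
    """Group thread blocks whose accesses fall on exactly the same set of pages.

    block_addresses maps a thread block id to the byte addresses it accesses.
    Returns a dict mapping each page set to the list of block ids sharing it.
    """
    batches = defaultdict(list)
    for block_id, addresses in sorted(block_addresses.items()):
        batches[pages_touched(addresses)].append(block_id)
    return dict(batches)


def dispatch(batches, num_sms):
    """Assign each whole batch to one SM (round-robin, a placeholder policy)."""
    plan = {}
    for i, (pages, blocks) in enumerate(batches.items()):
        plan[pages] = {"sm": i % num_sms, "blocks": blocks}
    return plan


if __name__ == "__main__":
    # Hypothetical access pattern: blocks 0 and 1 stay in page 0,
    # blocks 2 and 3 stay in page 1.
    accesses = {0: [0, 128], 1: [256, 4000], 2: [4096, 5000], 3: [4100, 8191]}
    batches = form_thread_batches(accesses)
    for pages, info in dispatch(batches, num_sms=2).items():
        print(sorted(pages), info)
```

In this toy input, the two blocks that touch only page 0 form one batch and the two that touch only page 1 form another, so each batch can be sent to its own SM as a unit.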
