Conference: International Symposium on Multidisciplinary Studies and Innovative Technologies

BOW: Breathing Operand Windows to Exploit Bypassing in GPUs

Abstract

The Register File (RF) is a critical structure in Graphics Processing Units (GPUs) and is responsible for a large portion of their area and power. To simplify its architecture, the RF is organized in a multi-bank configuration with a single port per bank. Not surprisingly, the frequent accesses to the register file during kernel execution incur a sizeable overhead in GPU power consumption, and they introduce delays because accesses are serialized when port conflicts occur. In this paper, we observe that there is a high degree of temporal locality in register accesses: within short instruction windows, the same registers are often accessed repeatedly. We characterize the opportunities to reduce register accesses as a function of the size of the instruction window considered, and establish that most GPU computations contain many recurring reads and updates of the same register operands. To exploit this opportunity, we propose Breathing Operand Windows (BOW), an enhanced GPU pipeline and operand collector organization that bypasses register file accesses and instead passes values directly between instructions within the same window. Our baseline design can only bypass register reads; we introduce an improved design that can also bypass unnecessary write operations to the RF. To further reduce write traffic, we introduce compiler optimizations that help guide the write-back destination of each operand depending on whether it will be reused. To reduce the storage overhead, we analyze the occupancy of the bypass buffers and find that they can be downsized significantly without losing performance. BOW, together with these optimizations, reduces the dynamic energy consumption of the register file by 55% and improves performance by 11%, at a modest cost of a 12 KB increase in the size of the operand collectors (4% of the register file size).
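The window-size characterization described in the abstract can be illustrated with a toy model. The sketch below is not the authors' methodology or hardware design: the function name `count_bypassable_reads`, the `(dest, srcs)` trace format, and the rule that a read is bypassable when the same register was read or written by one of the previous `window` instructions are all assumptions made for illustration.

```python
from collections import deque


def count_bypassable_reads(trace, window):
    """Count source-operand reads that a bypass window could serve.

    trace  -- list of (dest, srcs) tuples, one per instruction (hypothetical format)
    window -- number of preceding instructions whose operands stay available

    Returns (total_reads, bypassable_reads). This is a simplified model:
    it ignores warps, banks, and buffer capacity.
    """
    # Register sets touched by the most recent `window` instructions.
    recent = deque(maxlen=window)
    total = 0
    bypassable = 0
    for dest, srcs in trace:
        in_window = set().union(*recent)
        for reg in srcs:
            total += 1
            if reg in in_window:
                bypassable += 1  # value was read or produced nearby
        touched = set(srcs)
        if dest is not None:
            touched.add(dest)
        recent.append(touched)
    return total, bypassable


# A made-up four-instruction trace, for illustration only.
trace = [
    ("r1", ["r2", "r3"]),
    ("r4", ["r1", "r2"]),
    ("r5", ["r4", "r6"]),
    ("r2", ["r5", "r1"]),
]

for w in (0, 1, 2):
    print(w, count_bypassable_reads(trace, w))
```

On this made-up trace, widening the window from 1 to 2 instructions raises the number of bypassable reads from 4 to 5 out of 8, which mirrors the qualitative claim that reuse opportunities grow with window size. The figures say nothing about real GPU workloads.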
