Conference: International Conference on Parallel Processing Workshops

Energy Efficient Affine Register File for GPU Microarchitecture


Abstract

Modern GPUs accommodate thousands of hardware threads, each with its own dedicated register file for fast context switching, to achieve high throughput and performance; as a result, power consumption has become an important issue. It has been observed that many SIMD groups in a GPU execute with the same input values and generate the same output values, and uniform/scalar register files have therefore been proposed to eliminate the redundant computations and memory accesses of these scalar executions. In this paper, we propose an affine register file design for GPUs that reduces redundant execution when the input values follow uniform or affine patterns. We use a pair of registers, a base and a stride, to store an affine vector, and dedicated affine ALUs to execute affine instructions. The compiler performs an analysis to detect affine vectors and instructions and annotates these non-vector computations. Moreover, if an operation cannot keep its result in affine form, a compiler-assisted hardware conversion mechanism translates the affine vector into a general vector. Our evaluation shows that the design reduces the share of vector computation to 44.85%, with the remaining 55.15% dispatched to scalar and affine computation. It also reduces register file energy consumption by approximately 66.84%, ALU energy consumption by 38.67%, and total GPU energy consumption by 4.78% on average.
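
The base-and-stride representation described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `Affine`, `affine_add`, `affine_mul`, and the 32-lane width `WARP_SIZE` are hypothetical and chosen for illustration. It shows that adding two affine vectors, or scaling one by a uniform value, stays in affine form, while multiplying two non-uniform affine vectors does not and must fall back to a general per-lane vector.

```python
from dataclasses import dataclass
from typing import List, Union

WARP_SIZE = 32  # assumed SIMD width; the abstract does not state one


@dataclass(frozen=True)
class Affine:
    """Hypothetical model of an affine vector: lane i holds base + stride * i."""
    base: int
    stride: int  # stride == 0 models a uniform (scalar) value

    def expand(self, lanes: int = WARP_SIZE) -> List[int]:
        """Convert to a general vector with one explicit value per lane."""
        return [self.base + self.stride * i for i in range(lanes)]


def affine_add(a: Affine, b: Affine) -> Affine:
    """The sum of two affine vectors is affine: two scalar adds replace one add per lane."""
    return Affine(a.base + b.base, a.stride + b.stride)


def affine_mul(a: Affine, b: Affine) -> Union[Affine, List[int]]:
    """Multiplication stays affine only when at least one operand is uniform."""
    if b.stride == 0:
        return Affine(a.base * b.base, a.stride * b.base)
    if a.stride == 0:
        return Affine(a.base * b.base, a.base * b.stride)
    # The product of two non-uniform affine vectors is quadratic in the lane
    # index, so the result is converted to a general vector.
    return [x * y for x, y in zip(a.expand(), b.expand())]


# Illustrative values: a thread-index pattern (0, 1, 2, ...) plus a uniform offset.
tid = Affine(base=0, stride=1)
offset = Affine(base=128, stride=0)
addr = affine_add(tid, offset)          # still affine
scaled = affine_mul(tid, Affine(4, 0))  # still affine
squared = affine_mul(tid, tid)          # falls back to a general vector
```

In this model, an affine result costs two scalar operations regardless of lane count, which mirrors the kind of saving the abstract attributes to affine ALUs; the fallback branch plays the role of the conversion to a general vector.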
