Energy Efficient Affine Register File for GPU Microarchitecture


Abstract

Modern GPUs accommodate thousands of hardware threads, each with its own dedicated register file for fast context switching, in order to achieve high throughput and performance; as a result, power consumption has become an important issue. It has been observed that many SIMD groups in GPUs execute with the same input values and generate the same output values, and uniform/scalar register files have therefore been proposed to eliminate the redundant computations and memory accesses of these scalar executions. In this paper, we propose an affine register file design for GPUs that reduces redundant execution when the input values follow uniform or affine patterns. We use a pair of registers, a base and a stride, to store an affine vector, and dedicated affine ALUs to execute affine instructions. The compiler performs analysis to detect affine vectors and instructions and adds annotations for these non-vector computations. Moreover, if an operation cannot keep its value in affine form, a compiler-assisted hardware conversion mechanism translates the affine vector into a general vector. Our evaluation shows that the design reduces the vector computation rate to 44.85%, with the remaining 55.15% of computations dispatched to scalar and affine computation. The design also reduces register file energy consumption by approximately 66.84%, ALU energy consumption by 38.67%, and total GPU energy consumption by 4.78% on average.
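
The base/stride idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `AffineVec`, `affine_add`, `affine_mul`, and `WARP_SIZE` are hypothetical, the SIMD width of 32 is assumed, and integer overflow is ignored. It shows that lane i of an affine vector holds base + i * stride, that addition and multiplication by a uniform value keep the result affine, and that multiplying two non-uniform affine vectors forces a fallback to a general per-lane vector.

```python
from dataclasses import dataclass
from typing import List, Union

WARP_SIZE = 32  # assumed SIMD width; the abstract does not state one


@dataclass(frozen=True)
class AffineVec:
    """Hypothetical model of an affine register: lane i holds base + i * stride."""
    base: int
    stride: int  # stride == 0 models the uniform/scalar case

    def expand(self, lanes: int = WARP_SIZE) -> List[int]:
        # Conversion to a general vector: materialize every lane.
        return [self.base + i * self.stride for i in range(lanes)]


def affine_add(a: AffineVec, b: AffineVec) -> AffineVec:
    # (b1 + i*s1) + (b2 + i*s2) = (b1 + b2) + i*(s1 + s2): always affine.
    return AffineVec(a.base + b.base, a.stride + b.stride)


def affine_mul(a: AffineVec, b: AffineVec) -> Union[AffineVec, List[int]]:
    # The product stays affine only when at least one operand is uniform.
    if b.stride == 0:
        return AffineVec(a.base * b.base, a.stride * b.base)
    if a.stride == 0:
        return AffineVec(a.base * b.base, b.stride * a.base)
    # Otherwise the result is quadratic in the lane index, so fall back
    # to a general vector computed lane by lane.
    return [x * y for x, y in zip(a.expand(), b.expand())]


# Typical address computation: addr = tid * 4 + 1000
tid = AffineVec(base=0, stride=1)           # thread index pattern 0, 1, 2, ...
scaled = affine_mul(tid, AffineVec(4, 0))   # still affine
addr = affine_add(scaled, AffineVec(1000, 0))
squared = affine_mul(tid, tid)              # no longer affine: general vector
```

In this model the address computation needs two scalar-sized operations on (base, stride) pairs instead of 32 per-lane operations, which is the kind of redundancy the abstract targets; only `squared` requires the full-width fallback.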

Bibliographic details

Source
International Conference on Parallel Processing Workshops, 2016, pp. 52-58 (7 pages)
Authors
Shao-Chung Wang; Li-Chen Kan; Yuan-Shin Hwang; Jenq-Kuen Lee
Original format
PDF
Keywords
Registers; Graphics processing units; Instruction sets; Computational modeling; Hardware; Energy consumption; Computer architecture

Similar literature

1. Architecture and Compiler Support for GPUs Using Energy-Efficient Affine Register Files [J]. Wang Shao-Chung, Kan Li-Chen, Lee Chao-Lin. ACM Transactions on Design Automation of Electronic Systems. 2018, no. 2
2. AN ENERGY-EFFICIENT DEMAND-DRIVEN REGISTER FILE FOR MOBILE GPUs [J]. CHIH-CHIEH HSIAO, CHIU-CHENG HSIEH, SLO-LI CHU. Journal of Circuits, Systems and Computers. 2014, no. 2
3. An Energy-Efficient GPGPU Register File Architecture Using Racetrack Memory [J]. Mengjie Mao, Wujie Wen, Yaojun Zhang. IEEE Transactions on Computers. 2017, no. 9
4. Energy Efficient Affine Register File for GPU Microarchitecture [C]. Shao-Chung Wang, Li-Chen Kan, Yuan-Shin Hwang. International Workshop on Embedded Multicore Systems. 2016
5. Static analysis for efficient affine arithmetic on GPUs [D]. Chan, Bryan. 2008
6. GPUmotif: An Ultra-Fast and Energy-Efficient Motif Analysis Program Using Graphics Processing Units [O]. Pooya Zandevakili, Ming Hu, Zhaohui Qin. 2009
7. Hi-End: Hierarchical, Endurance-Aware STT-MRAM-Based Register File for Energy-Efficient GPUs [O]. Won Jeon, Jun Hyun Park, Yoonsoo Kim. 2020
