Conference: 2017 IEEE 23rd Symposium on High Performance Computer Architecture

Pilot Register File: Energy Efficient Partitioned Register File for GPUs


Abstract

GPU adoption for general purpose computing has been accelerating. To support a large number of concurrently active threads, GPUs are provisioned with a very large register file (RF). The RF power consumption is a critical concern. One option to reduce the power consumption dramatically is to use near-threshold voltage (NTV) to operate the RF. However, operating MOSFET devices at NTV is fraught with stability and reliability concerns. The adoption of FinFET devices in the chip industry is providing a promising path to operate the RF at NTV while satisfactorily tackling the stability and reliability concerns. However, the fundamental problem of NTV operation, namely slow access latency, remains. To tackle this challenge, in this paper we propose to build a partitioned RF using FinFET technology. The partitioned RF design exploits our observation that applications exhibit a strong preference to utilize a small subset of their registers. One way to exploit this behavior is to cache the RF content, as has been proposed in recent works. However, caching leads to unnecessary area overheads since a fraction of the RF must be replicated. Furthermore, we show that caching is not efficient as we increase the number of issued instructions per cycle, which is the expected trend in GPU designs. The proposed partitioned RF splits the registers into two partitions: the highly accessed registers are stored in a small RF that switches between high and low power modes. We use the FinFET's back gate control to provide low overhead switching between the two power modes. The remaining registers are stored in a large RF partition that always operates at NTV. The assignment of the registers to the two partitions will be based on statistics collected by a hybrid profiling technique that combines compiler-based profiling with the pilot warp profiling technique proposed in this paper. The partitioned FinFET RF is able to save 39% and 54% of the RF leakage and dynamic energy, respectively, and suffers less than 2% performance overhead.
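
The partitioning idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how the profiling statistics are turned into an assignment, so the code below simply assumes that a register access trace is available (for example, from a hypothetical profiled warp) and places the most frequently accessed registers in the small fast partition. The function names, the `fast_size` parameter, and the sample trace are all illustrative assumptions.

```python
from collections import Counter


def partition_registers(access_trace, fast_size):
    """Split register ids into (fast, slow) sets by access frequency.

    access_trace: list of register ids, one entry per access (assumed input).
    fast_size: number of registers the small partition can hold (assumed).
    Only registers that appear in the trace are assigned.
    """
    counts = Counter(access_trace)
    # Rank by descending access count; break ties by register id
    # so the result is deterministic.
    ranked = sorted(counts, key=lambda r: (-counts[r], r))
    fast = set(ranked[:fast_size])
    slow = set(counts) - fast
    return fast, slow


def fast_hit_fraction(access_trace, fast):
    """Fraction of accesses in the trace served by the fast partition."""
    if not access_trace:
        return 0.0
    hits = sum(1 for r in access_trace if r in fast)
    return hits / len(access_trace)


# Hypothetical trace: registers 0 and 1 dominate the accesses.
trace = [0, 1, 0, 2, 0, 1, 3, 0, 1, 4]
fast, slow = partition_registers(trace, fast_size=2)
print(sorted(fast), sorted(slow), fast_hit_fraction(trace, fast))
```

With a skewed trace like this one, a fast partition holding only two of the five registers still serves most accesses, which is the kind of behavior the abstract's observation about register usage relies on.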
