Get Out of the Valley: Power-Efficient Address Mapping for GPUs



Abstract

GPU memory systems adopt a multi-dimensional hardware structure to provide the bandwidth necessary to support 100s to 1000s of concurrent threads. On the software side, GPU-compute workloads also use multi-dimensional structures to organize the threads. We observe that these structures can combine unfavorably and create significant resource imbalance in the memory subsystem - causing low performance and poor power-efficiency. The key issue is that it is highly application-dependent which memory address bits exhibit high variability. To solve this problem, we first provide an entropy analysis approach tailored for the highly concurrent memory request behavior in GPU-compute workloads. Our window-based entropy metric captures the information content of each address bit of the memory requests that are likely to co-exist in the memory system at runtime. Using this metric, we find that GPU-compute workloads exhibit entropy valleys distributed throughout the lower order address bits. This indicates that efficient GPU-address mapping schemes need to harvest entropy from broad address-bit ranges and concentrate the entropy into the bits used for channel and bank selection in the memory subsystem. This insight leads us to propose the Page Address Entropy (PAE) mapping scheme which concentrates the entropy of the row, channel and bank bits of the input address into the bank and channel bits of the output address. PAE maps straightforwardly to hardware and can be implemented with a tree of XOR-gates. PAE improves performance by 1.31X and power-efficiency by 1.25X compared to state-of-the-art permutation-based address mapping.
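The two ideas in the abstract, measuring per-bit entropy over windows of concurrent requests and folding several input bits into one output bit with XOR gates, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not give the exact metric or the PAE bit selection, so the Shannon-entropy definition, the non-overlapping windows, the function names, and the bit positions 6 and 8 are all assumptions made for illustration.

```python
import math
from functools import reduce


def bit_entropy(addresses, bit):
    """Shannon entropy (0.0 to 1.0) of one address bit over a list of addresses."""
    if not addresses:
        return 0.0
    ones = sum((a >> bit) & 1 for a in addresses)
    p = ones / len(addresses)
    if p in (0.0, 1.0):
        return 0.0  # the bit never changes, so it carries no information
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def window_entropy(addresses, bit, window):
    """Mean entropy of one bit over non-overlapping windows (assumed definition)."""
    chunks = [addresses[i:i + window] for i in range(0, len(addresses), window)]
    if not chunks:
        return 0.0
    return sum(bit_entropy(c, bit) for c in chunks) / len(chunks)


def xor_fold(address, source_bits):
    """XOR the selected input bits into a single bit, as an XOR-gate tree would."""
    return reduce(lambda acc, b: acc ^ ((address >> b) & 1), source_bits, 0)


def remap(address, target_bits, sources):
    """Replace each target bit with the XOR of its source bits.

    The mapping stays one-to-one when every source list contains its own
    target bit and otherwise only bits that are not targets.
    """
    out = address
    for t, src in zip(target_bits, sources):
        out = (out & ~(1 << t)) | (xor_fold(address, src) << t)
    return out


# Hypothetical workload: 64 requests with a 256-byte stride.
addresses = [i * 256 for i in range(64)]

# Hypothetical layout: bit 6 selects the channel, bit 8 is a higher-order bit.
mapped = [remap(a, target_bits=[6], sources=[[6, 8]]) for a in addresses]

before = window_entropy(addresses, bit=6, window=16)
after = window_entropy(mapped, bit=6, window=16)
```

With this strided pattern, bit 6 is constant in the input addresses, so every request would land on the same channel under the assumed layout. After the XOR fold, bit 6 inherits the variation of bit 8, which is the sense in which entropy is "harvested" from other bits and concentrated into a selection bit.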


Bibliographic details

Source
ACM/IEEE Annual International Symposium on Computer Architecture | 2018 | pp. 166-179 | 14 pages
Authors
Yuxi Liu; Xia Zhao; Magnus Jahre; Zhenlin Wang; Xiaolin Wang; Yingwei Luo; Lieven Eeckhout;
Original format: PDF
Keywords
Entropy; Instruction sets; Graphics processing units; Random access memory; Hardware; Measurement; Organizations;


Similar literature

1. Architectural Support for Address Translation on GPUs: Designing Memory Management Units for CPU/GPUs with Unified Address Spaces [J]. Bharath Pichai, Lisa Hsu, Abhishek Bhattacharjee. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2014, No. 4
2. Buck Converter Makes Power-Efficient, Robust Cockpit GPUs [J]. Jerome Johnston. Asia electronics industry. 2016, No. 11
3. EcoG: A Power-Efficient GPU Cluster Architecture for Scientific Computing [J]. Mike Showerman, Jeremy Enos, Craig Steffen, Computing in science & engineering. 2011, No. 2
4. Get Out of the Valley: Power-Efficient Address Mapping for GPUs [C]. Yuxi Liu, Xia Zhao, Magnus Jahre, ACM/IEEE Annual International Symposium on Computer Architecture. 2018
5. Optimization techniques for mapping algorithms and applications onto CUDA GPU platforms and CPU-GPU heterogeneous platforms [D]. Wu, Jing. 2014
6. Efficient Probabilistic and Geometric Anatomical Mapping Using Particle Mesh Approximation on GPUs [O]. Linh Ha, Marcel Prastawa, Guido Gerig, 2011
7. Address-stride assisted approximate load value prediction in GPUs [O]. Haonan Wang, Mohamed Ibrahim, Sparsh Mittal, 2019
