Journal: Computer Architecture News

Access Pattern-Aware Cache Management for Improving Data Utilization in GPU


Abstract

The long latency of memory operations is a prominent performance bottleneck in graphics processing units (GPUs). The small data cache that must be shared across dozens of warps (collections of threads) creates significant cache contention and premature data eviction. Prior works have recognized this problem and proposed warp throttling, which reduces the number of active warps contending for cache space. In this paper we discover that individual load instructions in a warp exhibit four different types of data locality behavior: (1) data brought in by a warp load instruction is used only once, classified as streaming data; (2) data brought in by a warp load is reused multiple times within the same warp, called intra-warp locality; (3) data brought in by a warp is reused multiple times but across different warps, called inter-warp locality; and (4) some data exhibit a mix of both intra- and inter-warp locality. Furthermore, each load instruction consistently exhibits the same locality type across all warps within a GPU kernel. Based on this discovery we argue that cache management must use per-load locality type information rather than warp-wide cache management policies. We propose Access Pattern-aware Cache Management (APCM), which dynamically detects the locality type of each load instruction by monitoring the accesses from one exemplary warp. APCM then uses the detected locality type to selectively apply cache bypassing and cache pinning of data based on the load's locality characterization. Using an extensive set of simulations, we show that APCM improves the performance of GPUs by 34% for cache-sensitive applications while saving 27% of energy consumption over a baseline GPU.
