Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs

Oh Yunho; Kim Keunsoo; Yoon Myung Kuk; Park Jong Hyun; Park Yongjun; Annavaram Murali; Ro Won Woo


Abstract

This paper proposes a new architecture, called Adaptive PREfetching and Scheduling (APRES), which improves the cache efficiency of GPUs. APRES relies on the observation that GPU loads tend to have either high locality or strided access patterns across warps. APRES schedules warps so that as many cache hits as possible are generated before any cache miss occurs. Rather than directly predicting future cache hits and misses for each warp, APRES forms a group of warps that will shortly execute the same static load and prioritizes the grouped warps. If the first executed warp in the group hits the cache, the grouped warps are likely to access the same cache lines. Otherwise, APRES treats the load as strided and generates prefetch requests for the grouped warps. In addition, APRES includes a new dynamic L1 prefetch and data cache partitioning scheme to reduce contention between demand-fetched and prefetched lines. In our evaluation, APRES achieves a 27.8 percent performance improvement.
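The hit-or-prefetch decision described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the names `decide`, `Decision`, and `line_of`, the 128-byte line size, and the idea that a per-warp stride is supplied as an input are all hypothetical, since the abstract does not say how the stride is obtained.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 128  # assumed line size; not stated in the abstract


@dataclass
class Decision:
    kind: str                # "locality" or "strided"
    prioritized_warps: list  # grouped warps to schedule next
    prefetch_addrs: dict     # warp id -> address to prefetch (may be empty)


def line_of(addr):
    """Map a byte address to its cache-line index."""
    return addr // CACHE_LINE_BYTES


def decide(first_warp, first_addr, group, cached_lines, stride):
    """Classify a static load from the outcome of the first warp's access.

    group: warp ids expected to execute the same static load shortly.
    cached_lines: set of line indices currently in the modeled L1.
    stride: assumed per-warp address stride (hypothetical input).
    """
    others = [w for w in group if w != first_warp]
    if line_of(first_addr) in cached_lines:
        # First warp hit: grouped warps likely touch the same lines,
        # so they are prioritized and no prefetch is issued.
        return Decision("locality", others, {})
    # First warp missed: treat the load as strided and prefetch
    # the predicted address for each remaining grouped warp.
    prefetch = {w: first_addr + (w - first_warp) * stride for w in others}
    return Decision("strided", others, prefetch)
```

For example, with a modeled cache holding only the line of address 0x1000, `decide(0, 0x1000, [0, 1, 2], {line_of(0x1000)}, 128)` classifies the load as high-locality and issues no prefetches, whereas a first access to 0x2000 with an assumed stride of 256 bytes yields prefetch addresses for warps 1 and 2.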

Bibliographic details

Source
IEEE Transactions on Computers, 2019, Issue 4, pp. 609-616 (8 pages)
Authors
Oh Yunho; Kim Keunsoo; Yoon Myung Kuk; Park Jong Hyun; Park Yongjun; Annavaram Murali; Ro Won Woo;
Author affiliations

Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea;

Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea;

Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea;

Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea;

Hanyang Univ, Div Comp Sci & Engn, Seoul 04763, South Korea;

Univ Southern Calif, Ming Hsieh Dept Elect Engn, Los Angeles, CA 90007 USA;

Yonsei Univ, Sch Elect & Elect Engn, Seoul 03722, South Korea;

Original format: PDF
Language: English
Keywords
GPU; cache; warp scheduling; data prefetching; performance;

Similar literature

1. Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs [J]. Oh Yunho, Kim Keunsoo, Yoon Myung Kuk, IEEE Transactions on Computers. 2019, Issue 4
2. WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs [J]. Yunho Oh, Myung Kuk Yoon, Jong Hyun Park, Computers, IEEE Transactions on. 2018, Issue 9
3. CAWA: Coordinated Warp Scheduling and Cache Prioritization for Critical Warp Acceleration of GPGPU Workloads [J]. Shin-Ying Lee, Akhil Arunkumar, Carole-Jean Wu, Computer Architecture News. 2015, Issue 3
4. Optimization of Stride Prefetching Mechanism and Dependent Warp Scheduling on GPGPU [C]. Tsung-Han Tsou, Dun-Jie Chen, Sheng-Yang Hung, IEEE International Symposium on Circuits and Systems. 2020
5. The Development of WARP - A Framework for Continuous Energy Monte Carlo Neutron Transport in General 3D Geometries on GPUs [D]. Bergmann, Ryan. 2014
6. Time-Warped Comparison of Gene Expression in Adaptive and Maladaptive Cardiac Hypertrophy [O]. Sean P. Sheehy, Sui Huang, Kevin Kit Parker
7. Warp-Aware Trace Scheduling for GPUs [O]. James A. Jablin, Thomas B. Jablin, Onur Mutlu, 2014
