Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs

Oh Yunho; Kim Keunsoo; Yoon Myung Kuk; Park Jong Hyun; Park Yongjun; Annavaram Murali; Ro Won Woo

Abstract

This paper proposes a new architecture, called Adaptive PREfetching and Scheduling (APRES), which improves the cache efficiency of GPUs. APRES relies on the observation that GPU loads tend to have either high locality or strided access patterns across warps. APRES schedules warps so that as many cache hits as possible are generated before any cache miss occurs. Rather than directly predicting future cache hits and misses for each warp, APRES creates a group of warps that will shortly execute the same static load and prioritizes the grouped warps. If the first warp executed in the group hits the cache, the grouped warps are likely to access the same cache lines. Otherwise, APRES treats the load as strided and generates prefetch requests for the grouped warps. In addition, APRES includes a new dynamic partitioning of the L1 cache into prefetch and data regions to reduce contention between demand-fetched and prefetched lines. In our evaluation, APRES achieves a 27.8 percent performance improvement.
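
The grouping and prefetch decision described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the names `Warp`, `group_by_load`, and `schedule_group`, the 128-byte line size, and the rule of estimating the stride from the first two warps of a group are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

LINE_SIZE = 128  # bytes per cache line (assumed value, not from the abstract)


@dataclass
class Warp:
    warp_id: int
    next_load_pc: int  # PC of the static load this warp will execute next
    address: int       # address that load will touch (known here only because we simulate)


def line_of(address: int) -> int:
    """Map a byte address to its cache-line index."""
    return address // LINE_SIZE


def group_by_load(warps: List[Warp]) -> Dict[int, List[Warp]]:
    """Group warps that are about to execute the same static load."""
    groups: Dict[int, List[Warp]] = {}
    for warp in warps:
        groups.setdefault(warp.next_load_pc, []).append(warp)
    return groups


def schedule_group(group: List[Warp], cache: Set[int]) -> dict:
    """Run the first warp of a group and decide how to treat the rest.

    Hit  -> the load is assumed to have high locality; the remaining warps
            are simply prioritized and no prefetch is issued.
    Miss -> the load is treated as strided; the stride is estimated from the
            first two warps (an assumption of this sketch) and lines are
            prefetched for the remaining warps.
    """
    first = group[0]
    first_hit = line_of(first.address) in cache
    cache.add(line_of(first.address))  # a demand miss fills the line
    if first_hit or len(group) < 3:
        return {"first_hit": first_hit, "prefetched_lines": []}

    second = group[1]
    cache.add(line_of(second.address))  # second warp is demand-fetched too
    stride = (second.address - first.address) // (second.warp_id - first.warp_id)

    prefetched: List[int] = []
    for warp in group[2:]:
        predicted = first.address + stride * (warp.warp_id - first.warp_id)
        line = line_of(predicted)
        if line not in cache:
            cache.add(line)
            prefetched.append(line)
    return {"first_hit": False, "prefetched_lines": prefetched}


if __name__ == "__main__":
    # Four warps executing the same load with a 512-byte stride across warps.
    warps = [Warp(warp_id=i, next_load_pc=0x40, address=i * 512) for i in range(4)]
    for pc, group in group_by_load(warps).items():
        print(hex(pc), schedule_group(group, cache=set()))
```

The model keeps the cache as a plain set of line indices and ignores capacity, replacement, and the prefetch/data partitioning, so it shows only the hit-versus-stride decision, not any performance effect.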

Bibliographic Information

Source
IEEE Transactions on Computers | 2019, Issue 4 | pp. 609-616 | 8 pages
Authors
Oh Yunho; Kim Keunsoo; Yoon Myung Kuk; Park Jong Hyun; Park Yongjun; Annavaram Murali; Ro Won Woo;
Author affiliations

Yonsei Univ Sch Elect & Elect Engn Seoul 03722 South Korea;

Yonsei Univ Sch Elect & Elect Engn Seoul 03722 South Korea;

Yonsei Univ Sch Elect & Elect Engn Seoul 03722 South Korea;

Yonsei Univ Sch Elect & Elect Engn Seoul 03722 South Korea;

Hanyang Univ Div Comp Sci & Engn Seoul 04763 South Korea;

Univ Southern Calif Ming Hsieh Dept Elect Engn Los Angeles CA 90007 USA;

Yonsei Univ Sch Elect & Elect Engn Seoul 03722 South Korea;

Indexing information
Original format: PDF
Language: English
Keywords
GPU; cache; warp scheduling; data prefetching; performance;
