Journal: Journal of Supercomputing

A quantitative evaluation of unified memory in GPUs

Abstract

The introduction of unified memory and demand paging has simplified programming of graphics processing units (GPUs). It has also enabled oversubscribing the memory for a GPU. However, the overhead of page management makes page faults a performance bottleneck. Sometimes the page eviction policy is unable to mitigate performance slowdown caused by page faults and memory oversubscription. On average, eviction policies such as Random and CAR are not competitive with a traditional least recently used (LRU) policy. Other policies, such as CLOCK-Pro, are designed to overcome challenges with LRU, but they only achieve limited speedup. Even enhancing LRU with page walk hit information does not lead to notable performance improvement. Based on these observations, we propose optimization opportunities to mitigate performance degradation caused by page faults and memory oversubscription. These optimization opportunities include an effective page eviction policy that retains LRU's advantages while addressing LRU's inability to deal with thrashing access patterns, page prefetch and pre-eviction, memory-aware throttling, and capacity compression.
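The abstract's point about LRU and thrashing access patterns can be illustrated with a toy page-replacement simulation. The sketch below is not the paper's methodology or simulator: the function names (`lru_faults`, `random_faults`, `cyclic_trace`), the trace, and the capacity are all hypothetical choices made for illustration. It counts page faults when a working set of 10 pages is swept cyclically through a memory that holds only 8 pages, a small stand-in for an oversubscribed GPU memory.

```python
import random
from collections import OrderedDict


def lru_faults(trace, capacity):
    """Count page faults for a memory of `capacity` pages managed by LRU."""
    resident = OrderedDict()  # keys ordered from least to most recently used
    faults = 0
    for page in trace:
        if page in resident:
            resident.move_to_end(page)  # hit: mark as most recently used
        else:
            faults += 1
            if len(resident) >= capacity:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


def random_faults(trace, capacity, seed=0):
    """Count page faults when the eviction victim is chosen uniformly at random."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same count
    slots = []                 # resident pages, indexable for victim selection
    resident = set()           # same pages, for fast membership tests
    faults = 0
    for page in trace:
        if page in resident:
            continue  # hit: random replacement keeps no recency state
        faults += 1
        if len(slots) >= capacity:
            victim = rng.randrange(len(slots))
            resident.discard(slots[victim])
            slots[victim] = page  # reuse the victim's slot for the new page
        else:
            slots.append(page)
        resident.add(page)
    return faults


def cyclic_trace(num_pages, passes):
    """Sweep pages 0..num_pages-1 in order, `passes` times (a thrashing pattern)."""
    return [page for _ in range(passes) for page in range(num_pages)]


if __name__ == "__main__":
    trace = cyclic_trace(num_pages=10, passes=5)
    print("LRU faults, capacity 8:   ", lru_faults(trace, 8))
    print("Random faults, capacity 8:", random_faults(trace, 8))
    print("LRU faults, capacity 10:  ", lru_faults(trace, 10))
```

With the working set two pages larger than memory, LRU always evicts the page that will be needed soonest, so every one of the 50 accesses faults. Once the capacity covers all 10 pages, only the 10 cold misses remain. Random replacement usually keeps some pages resident across sweeps on this particular trace. That is consistent with the abstract, which reports that Random is not competitive with LRU on average while LRU still needs help on thrashing patterns.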
