Journal: ACM Transactions on Architecture and Code Optimization

MAPS: Optimizing Massively Parallel Applications Using Device-Level Memory Abstraction


Abstract

GPUs play an increasingly important role in high-performance computing. While developing naive code is straightforward, optimizing massively parallel applications requires deep understanding of the underlying architecture. The developer must struggle with complex index calculations and manual memory transfers. This article classifies memory access patterns used in most parallel algorithms, based on Berkeley's Parallel "Dwarfs." It then proposes the MAPS framework, a device-level memory abstraction that facilitates memory access on GPUs, alleviating complex indexing using on-device containers and iterators. This article presents an implementation of MAPS and shows that its performance is comparable to carefully optimized implementations of real-world applications.
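The abstract's central idea is to replace manual index arithmetic with on-device containers and iterators. The sketch below illustrates that general pattern in CUDA; the `Window` type, its members, and the `meanFilter` kernel are hypothetical illustrations written for this page and are not taken from the actual MAPS API.

```cuda
// Hypothetical sketch of an iterator-style GPU memory abstraction.
// None of these types come from the MAPS library; they only
// illustrate the container/iterator idea described in the abstract.
#include <cuda_runtime.h>
#include <cstdio>

// A tiny read-only "window" over a 1D range of global memory. A real
// device-level abstraction would also handle shared-memory staging and
// boundary conditions behind the same interface.
struct Window {
    const float* data;
    int begin;
    int end;  // exclusive

    __device__ const float* cbegin() const { return data + begin; }
    __device__ const float* cend()   const { return data + end; }
};

// 1D mean filter written against the window instead of raw indices:
// the kernel body no longer repeats boundary arithmetic per access.
__global__ void meanFilter(const float* in, float* out, int n, int radius) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    Window w{in, max(i - radius, 0), min(i + radius + 1, n)};

    float sum = 0.0f;
    int count = 0;
    for (const float* p = w.cbegin(); p != w.cend(); ++p) {
        sum += *p;
        ++count;
    }
    out[i] = sum / count;
}

int main() {
    const int n = 1024, radius = 2;
    float *in, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = static_cast<float>(i);

    meanFilter<<<(n + 255) / 256, 256>>>(in, out, n, radius);
    cudaDeviceSynchronize();

    printf("out[0] = %f, out[512] = %f\n", out[0], out[512]);
    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

In this sketch the neighborhood bounds are computed once when the window is constructed, so the loop body reads like ordinary iterator code; the paper's framework applies the same idea at the device level for the memory access patterns it classifies.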
