Conference: ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

StreamScan: Fast Scan Algorithms for GPUs without Global Barrier Synchronization


Abstract

Scan (also known as prefix sum) is a useful primitive for many important parallel algorithms, such as sort, BFS, SpMV, and compaction. The current state-of-the-art GPU-based scan implementation consists of three consecutive Reduce-Scan-Scan phases. This approach requires at least two global barriers and 3N global memory accesses, where N is the problem size. In this paper we propose StreamScan, a novel approach that implements scan on GPUs with only one computation phase. The main idea is to restrict synchronization to adjacent workgroups only, thereby eliminating global barrier synchronization completely. The new approach requires only 2N global memory accesses and just one kernel invocation. On top of this we propose two important optimizations to further boost performance: thread grouping, which eliminates unnecessary local barriers, and register optimization, which expands the on-chip problem size. We designed an auto-tuning framework that searches the parameter space automatically to generate highly optimized code for both AMD and Nvidia GPUs. We implemented our technique with OpenCL. Compared with previous fast scan implementations, experimental results not only show promising performance speedups, but also reveal dramatically different optimization tradeoffs between Nvidia and AMD GPU platforms.
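The adjacent-workgroup idea described in the abstract can be sketched on a CPU as follows. This is a minimal illustration, not the authors' OpenCL code: each Python thread stands in for one workgroup, and the names `chained_scan`, `block_size`, `carry`, and `ready` are hypothetical. Each block scans its own slice, waits only for its left neighbour's running total, publishes its own total, and then writes its outputs, so every element is read once and written once.

```python
import threading
from itertools import accumulate


def chained_scan(data, block_size=4):
    """Inclusive prefix sum in which each block waits only on its left neighbour.

    Hypothetical CPU sketch: threads play the role of GPU workgroups.
    """
    n = len(data)
    num_blocks = (n + block_size - 1) // block_size
    out = [0] * n
    # carry[b] is the sum of all elements in blocks 0..b, valid once ready[b] is set.
    carry = [0] * num_blocks
    ready = [threading.Event() for _ in range(num_blocks)]

    def run_block(b):
        lo, hi = b * block_size, min((b + 1) * block_size, n)
        local = list(accumulate(data[lo:hi]))  # single read of each input element
        if b > 0:
            ready[b - 1].wait()                # synchronize with the adjacent block only
            offset = carry[b - 1]
        else:
            offset = 0
        carry[b] = offset + local[-1]
        ready[b].set()                         # release the right neighbour
        for i, v in enumerate(local):
            out[lo + i] = offset + v           # single write of each output element

    threads = [threading.Thread(target=run_block, args=(b,)) for b in range(num_blocks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(chained_scan([1, 2, 3, 4, 5, 6, 7, 8, 9], block_size=4))
```

No barrier spans all blocks here: block b depends only on block b-1, so the running total flows left to right as a chain. On a real GPU this pattern additionally relies on workgroups with lower indices making forward progress, which OS threads guarantee but a GPU scheduler may not without extra care.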
