Conference: 2018 Symposium on High Performance Computing Systems

A Fast and Generic GPU-Based Parallel Reduction Implementation



Abstract

Reduction operations are used extensively in computational problems in which a finite set of numeric elements is combined into a single value by means of a combining function. A parallel reduction, in turn, is the same operation performed concurrently when multiple execution units are available. This work presents a GPU-based parallel reduction that uses loop unrolling, persistent threads, and algebraic expressions to avoid thread divergence, and that surpasses the methods currently in use. Experiments conducted to evaluate the approach show that it performs efficiently on both AMD and NVIDIA hardware platforms, using either OpenCL or CUDA, which makes it portable.
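To make the idea of a reduction with a generic combining function concrete, the following is a minimal sketch in Python. It is not the implementation from the paper: it is a sequential CPU simulation of a pairwise (tree-shaped) reduction, and the names `tree_reduce`, `combine`, and `identity` are hypothetical. It assumes the combining function is associative. Loop unrolling, persistent threads, and the divergence-avoiding expressions mentioned in the abstract are not shown.

```python
from operator import add
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def tree_reduce(values: Iterable[T], combine: Callable[[T, T], T], identity: T) -> T:
    """Combine all values into one using a pairwise (tree-shaped) schedule.

    Assumption: `combine` is associative. Operand order is preserved,
    so it does not have to be commutative.
    """
    buf = list(values)
    n = len(buf)
    if n == 0:
        return identity

    stride = 1
    while stride < n:
        # Within one pass, every iteration reads and writes a disjoint pair
        # of slots. On parallel hardware these iterations could run
        # concurrently; here they are simply executed one after another.
        for i in range(0, n - stride, 2 * stride):
            buf[i] = combine(buf[i], buf[i + stride])
        stride *= 2

    return buf[0]


if __name__ == "__main__":
    print(tree_reduce(range(1, 101), add, 0))   # sum of 1..100
    print(tree_reduce([3, 9, 2, 7], max, 0))    # maximum element
```

The number of passes grows with the logarithm of the input size, which is why this schedule suits hardware with many execution units, whereas a plain left-to-right loop needs one step per element.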
