Journal: Computers & Geosciences

A comparison of native GPU computing versus OpenACC for implementing flow-routing algorithms in hydrological applications

Abstract

In recent years, GPU computing has gained wide acceptance as a simple, low-cost solution for speeding up computationally expensive processing in many scientific and engineering applications. However, in most cases accelerating a traditional CPU implementation for a GPU is a non-trivial task that requires thorough refactoring of the code and specific optimizations that depend on the architecture of the device. OpenACC is a promising technology that aims to reduce the effort required to accelerate C/C++/Fortran code on an attached multicore device. With this technology, the CPU code in principle only has to be augmented with a few compiler directives that identify the areas to be accelerated and the way in which data has to be moved between the CPU and GPU. Its potential benefits are multiple: better code readability, less development time, lower risk of errors, and less dependency on the underlying architecture and the future evolution of GPU technology. Our aim in this work is to evaluate the pros and cons of using OpenACC versus native GPU implementations in computationally expensive hydrological applications, using the classic D8 algorithm of O'Callaghan and Mark for river network extraction as a case study. We implemented the flow accumulation step of this algorithm on the CPU, with OpenACC, and in two different CUDA versions, comparing the length and complexity of the code and its performance on different datasets. We report that although OpenACC cannot match the performance of an optimized CUDA implementation (3.5× slower on average), it provides a significant performance improvement over a CPU implementation (2-6×) with far simpler code and less implementation effort. (C) 2015 Elsevier Ltd. All rights reserved.
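
To make the directive-based approach concrete, the following is a minimal sketch of how a D8 flow accumulation loop might be annotated with OpenACC. It is not the authors' implementation: the function name `flow_accumulation`, the direction encoding (0..7 for E, SE, S, SW, W, NW, N, NE, and -1 for no outflow), and the iterative "gather from upstream neighbours until nothing changes" scheme are all assumptions chosen for brevity. The sketch also assumes the direction grid contains no cycles; otherwise the loop would not terminate.

```c
#include <assert.h>
#include <stdlib.h>

/* Iterative D8 flow accumulation (illustrative sketch, not the paper's code).
 * dir[i] holds a direction code 0..7 (E, SE, S, SW, W, NW, N, NE; an assumed
 * encoding) or -1 for a cell with no outflow. On return, acc[i] is the number
 * of cells draining through cell i, counting the cell itself.
 * Assumes the direction grid has no cycles.
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int flow_accumulation(const int *dir, int *acc, int rows, int cols)
{
    assert(rows > 0 && cols > 0);

    /* Row/column offsets for each direction code; code k+4 (mod 8) is the
     * opposite of code k. */
    const int drow[8] = {0, 1, 1, 1, 0, -1, -1, -1};
    const int dcol[8] = {1, 1, 0, -1, -1, -1, 0, 1};

    int n = rows * cols;
    int *next = malloc((size_t)n * sizeof *next);
    if (next == NULL)
        return -1;

    /* Every cell contributes itself. */
    for (int i = 0; i < n; i++)
        acc[i] = 1;

    int changed = 1;
    /* Keep the grids on the device for the whole iteration; only the
     * "changed" flag travels back to the host after each sweep. */
    #pragma acc data copyin(dir[0:n], drow, dcol) copy(acc[0:n]) create(next[0:n])
    while (changed) {
        changed = 0;
        #pragma acc parallel loop collapse(2) reduction(|:changed)
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int sum = 1;
                for (int k = 0; k < 8; k++) {
                    int nr = r + drow[k], nc = c + dcol[k];
                    if (nr < 0 || nr >= rows || nc < 0 || nc >= cols)
                        continue;
                    /* Neighbour k drains into (r, c) if it points back. */
                    if (dir[nr * cols + nc] == (k + 4) % 8)
                        sum += acc[nr * cols + nc];
                }
                next[r * cols + c] = sum;
                if (sum != acc[r * cols + c])
                    changed |= 1;
            }
        }
        /* Publish the new values for the next sweep. */
        #pragma acc parallel loop
        for (int i = 0; i < n; i++)
            acc[i] = next[i];
    }

    free(next);
    return 0;
}
```

A compiler without OpenACC support ignores the `#pragma acc` lines, so the same source builds as a plain CPU version. With an OpenACC-capable compiler (for example `gcc -fopenacc` or `nvc -acc`), the two annotated loops are offloaded. The number of sweeps grows with the longest flow path, which is one reason a hand-tuned CUDA kernel could be expected to do better than a straightforward sketch like this one.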
