Published in: IEEE International Performance Computing and Communications Conference

Optimized GPU implementation for dynamic programming in image data processing

Abstract

Multi-core systems and heterogeneous architectures now supply much of the parallel computing power used in High Performance Computing (HPC) and scientific computing. Although many algorithms were originally proposed and implemented for sequential execution, parallel alternatives can solve the same problems with better-suited, higher-performance implementations. In this paper, three parallelization strategies are proposed and implemented for a dynamic programming based cloud smoothing application, using both shared memory and non-shared memory approaches. The experiments are performed on NVIDIA GeForce GT750m and Tesla K20m, two GPU accelerators of the Kepler architecture. A detailed performance analysis covers partition granularity at the block and thread levels, memory access efficiency, and computational complexity. The evaluations show that the parallel implementations produce closely approximated results with high efficiency, and the strategies can be adopted in similar data analysis and processing applications.
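The abstract does not give the cloud smoothing recurrence, so the following is only a minimal sketch of why grid-based dynamic programming lends itself to thread-level partitioning. It assumes a stand-in recurrence (a minimum-cost path over a small cost grid) in which each cell depends only on the previous row; the function names `dp_row_update` and `min_path_cost` and the sample grid are hypothetical and do not come from the paper.

```python
# Hypothetical illustration: row-wise dynamic programming on a cost grid.
# This is NOT the paper's cloud smoothing algorithm; the recurrence below
# is an assumed stand-in chosen to show the dependency structure.

def dp_row_update(prev_row, cost_row):
    """Compute one DP row from the previous row.

    Each output cell reads only prev_row, never its neighbours in the
    new row, so all cells of a row are independent of one another.
    On a GPU, each iteration of this loop could map to one thread.
    """
    n = len(cost_row)
    new_row = []
    for j in range(n):
        lo = max(j - 1, 0)          # clamp the window at the left edge
        hi = min(j + 1, n - 1)      # clamp the window at the right edge
        best_prev = min(prev_row[lo:hi + 1])
        new_row.append(cost_row[j] + best_prev)
    return new_row


def min_path_cost(cost):
    """Sweep the grid top to bottom.

    Rows must be processed in order (row i needs row i - 1), while the
    cells inside each row can be processed in parallel.
    """
    row = list(cost[0])
    for cost_row in cost[1:]:
        row = dp_row_update(row, cost_row)
    return min(row)


grid = [
    [1, 4, 3],
    [2, 1, 5],
    [6, 2, 1],
]
result = min_path_cost(grid)  # cheapest top-to-bottom path cost
```

In a CUDA version of such a sketch, the loop over `j` is the part that would be spread across threads, and staging `prev_row` in on-chip shared memory would be one way to cut repeated global memory reads. Whether the paper's three strategies take this form is an assumption, not something the abstract states.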
