Journal of Supercomputing

Real-time parallel image processing applications on multicore CPUs with OpenMP and GPGPU with CUDA



Abstract

This paper presents real-time image processing applications built on multicore and multiprocessing technologies. To this end, parallel image segmentation was performed on many images that together cover the entire surface of the same metallic, cylindrical moving objects. Experiments on a multicore CPU with OpenMP showed that increasing the chunk size reduces execution time by a factor of approximately four compared with serial computing. The same experiments were implemented on GPGPU using four techniques: (1) single image transmission with single pixel processing; (2) single image transmission with multiple pixel processing; (3) multiple image transmission with single pixel processing; and (4) multiple image transmission with multiple pixel processing. All four techniques were implemented on GeForce, Tesla K20, and Tesla K40 GPUs. Experiments on the GPUs with CUDA showed that speedup increases with the number of cores. The Tesla K40 gave the best results, with speedups over serial computing of 35 and 12 (first technique), 36 and 13 (second technique), 54 and 16 (third technique), and 71 and 17 (fourth technique), without and with data transmission time, respectively. The authors therefore recommend the Tesla K40 GPU with multiple image transmission and multiple pixel processing for maximum performance.
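The role of the OpenMP chunk size mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which segmentation algorithm was used, so a simple binary threshold stands in for it, and the function names `segment_image` and `segment_batch`, the 8-bit grayscale layout, and the choice to distribute whole images (rather than pixels) across threads are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the paper's segmentation step:
 * binary threshold on one 8-bit grayscale image. */
static void segment_image(const unsigned char *in, unsigned char *out,
                          size_t n_pixels, unsigned char threshold)
{
    for (size_t i = 0; i < n_pixels; ++i)
        out[i] = (in[i] > threshold) ? 255 : 0;
}

/* Segments a batch of equally sized images stored back to back.
 * With schedule(static, chunk), iterations are handed to threads
 * in blocks of `chunk` consecutive images (assumed scheduling choice). */
static void segment_batch(const unsigned char *in, unsigned char *out,
                          int n_images, size_t n_pixels,
                          unsigned char threshold, int chunk)
{
    assert(chunk > 0); /* OpenMP requires a positive chunk size */

    #pragma omp parallel for schedule(static, chunk)
    for (int img = 0; img < n_images; ++img) {
        size_t offset = (size_t)img * n_pixels;
        segment_image(in + offset, out + offset, n_pixels, threshold);
    }
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC or Clang), the loop runs in parallel; without it the pragma is ignored and the same code runs serially, which gives a convenient baseline for timing. Whether a larger chunk helps on a given machine depends on the image count and per-image cost, so the trend reported in the abstract should be measured rather than assumed.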
