Journal: Journal of Electronic Imaging

Adaptive discrete cosine transform-based image compression method on a heterogeneous system platform using Open Computing Language


Abstract

The discrete cosine transform (DCT) is one of the major operations in image compression standards, and it requires intensive and complex computations. Recent computer systems and handheld devices are equipped with high-capability computing devices, such as a general-purpose graphics processing unit (GPGPU), in addition to the traditional multicore CPU. We develop an optimized parallel implementation of the forward DCT algorithm for JPEG image compression using the recently proposed Open Computing Language (OpenCL). This OpenCL parallel implementation combines a multicore CPU and a GPGPU in a single solution to perform DCT computations efficiently, applying optimization techniques that reduce kernel execution time and improve data movement. Separate optimized OpenCL kernels were developed for the CPU and for the GPU, based on device-specific optimization factors such as thread mapping, thread granularity, vector-based memory access, and the given workload. The performance of the DCT is evaluated in a heterogeneous environment, and our OpenCL parallel implementation speeds up DCT execution by factors of 3.68 and 5.58 for different image sizes and formats, in terms of workload allocations and data transfer mechanisms. The obtained speedup indicates the scalability of the DCT performance.
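For readers unfamiliar with the operation being parallelized, the following is a minimal reference sketch of the forward 8x8 DCT-II as it is commonly defined for JPEG. It is not the authors' OpenCL kernel and applies none of the optimizations the abstract mentions. The function name `dct_8x8` is hypothetical, and the input is assumed to be a block that has already been level-shifted. The code is a plain serial version, written in Python for readability.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct_8x8(block):
    """Forward 2-D DCT-II of one 8x8 block (list of 8 rows of 8 numbers).

    Hypothetical reference helper: assumes the samples were already
    level-shifted (e.g. 128 subtracted from 8-bit pixel values).
    """
    def c(k):
        # Normalization factor: 1/sqrt(2) for the zero frequency, else 1
        return 1.0 / math.sqrt(2.0) if k == 0 else 1.0

    # cos_table[x][u] = cos((2x + 1) * u * pi / 16)
    cos_table = [
        [math.cos((2 * x + 1) * u * math.pi / (2 * N)) for u in range(N)]
        for x in range(N)
    ]

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += block[x][y] * cos_table[x][u] * cos_table[y][v]
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out


if __name__ == "__main__":
    flat = [[8] * N for _ in range(N)]  # constant block
    coeffs = dct_8x8(flat)
    print(round(coeffs[0][0], 6))  # DC coefficient
```

A constant block of value 8 yields a DC coefficient of 64, with all other coefficients zero up to rounding. Because each 8x8 block is transformed independently of the others, blocks can be distributed across many threads, which is what makes the operation a natural fit for a combined CPU and GPU implementation.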