
GPU Acceleration of Integer Wavelet Transform for TIFF Image

Abstract

In network-based publishing and printing, there is an enormous number of TIFF-format (CMYK) images, which require large amounts of storage space and substantial transmission bandwidth. The common need to handle such volumes of data raises the problem of fast lossless compression. The 2D integer wavelet transform can be used for lossless compression of still images; for example, the 5/3 lifting wavelet is the lossless transform of JPEG 2000. Today, multi-core (dual-, quad-, or eight-core) CPU technology helps speed up the wavelet transform, but the acceleration achievable with current multi-core CPUs is limited. This article presents the acceleration of the 2D integer wavelet transform with CUDA. On an NVIDIA graphics processing unit (GPU), multi-threaded parallelization offers attractive advantages over traditional CPU computation. Using a dual-core CPU and a CUDA device, the article accelerates the Haar and 5/3 lifting wavelets on TIFF-format images. For the Haar wavelet, the original image matrix is compared with the matrix of transform results. The analysis shows that four adjacent pixels of the original image directly determine the corresponding four pixels of the transform result, and that these four pixels are independent of all other pixels of the transform result. The Haar wavelet transform can therefore be parallelized with CUDA, with each kernel work unit handling four pixels. For the 5/3 lifting wavelet, there are four groups of experiments, each run with two CUDA memory methods (global memory and texture memory), for eight experiments in total. In the first group, the kernel performs only row transforms combined with a transpose, with one row as the unit of work. In the second, the kernel performs both row and column transforms without a transpose, again with one row as the unit of work. In the third, it performs row transforms and a transpose, but the unit of work is a single pixel. In the fourth, it performs row and column transforms without a transpose, also with a single pixel as the unit of work.
Experimental results on an NVIDIA GeForce 9800 GT and a dual-core CPU indicate that, for both the Haar and the 5/3 lifting wavelets, the GPU speedup becomes more pronounced as image resolution increases. For the 5/3 lifting wavelet, the second group of experiments with texture memory runs about 15 times faster than the CPU and reduces the running time by 5000 ms.
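The four-pixel independence described for the Haar wavelet can be illustrated with a small sketch. The abstract does not give the exact integer Haar formulas, so the version below assumes the common S-transform (floor average and difference). The function names and the quadrant layout (LL, HL, LH, HH) are illustrative choices, not taken from the article. Plain Python stands in for a CUDA kernel here, with each iteration of the block loop playing the role of one thread.

```python
def haar_pair(a, b):
    """Integer Haar (S-transform) of one pair: floor average and difference."""
    return (a + b) >> 1, a - b


def haar_pair_inv(s, d):
    """Exact inverse of haar_pair."""
    a = s + ((d + 1) >> 1)
    return a, a - d


def haar_block(p00, p01, p10, p11):
    """Transform one 2x2 block; uses only these four pixels."""
    s0, d0 = haar_pair(p00, p01)   # row step, top row
    s1, d1 = haar_pair(p10, p11)   # row step, bottom row
    ll, lh = haar_pair(s0, s1)     # column step on the averages
    hl, hh = haar_pair(d0, d1)     # column step on the differences
    return ll, hl, lh, hh


def haar_block_inv(ll, hl, lh, hh):
    """Recover the original 2x2 block from its four coefficients."""
    s0, s1 = haar_pair_inv(ll, lh)
    d0, d1 = haar_pair_inv(hl, hh)
    p00, p01 = haar_pair_inv(s0, d0)
    p10, p11 = haar_pair_inv(s1, d1)
    return p00, p01, p10, p11


def haar_2d(img):
    """One-level 2D integer Haar transform (even width and height assumed)."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    # Each (by, bx) iteration is independent, so it could map to one GPU thread.
    for by in range(h // 2):
        for bx in range(w // 2):
            ll, hl, lh, hh = haar_block(
                img[2 * by][2 * bx], img[2 * by][2 * bx + 1],
                img[2 * by + 1][2 * bx], img[2 * by + 1][2 * bx + 1])
            out[by][bx] = ll
            out[by][bx + w // 2] = hl
            out[by + h // 2][bx] = lh
            out[by + h // 2][bx + w // 2] = hh
    return out
```

Because every block reads four input pixels and writes four output pixels with no overlap, the loop body needs no synchronization between blocks, which is the property that makes a four-pixel work unit a natural fit for a GPU kernel.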
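The "row transform plus transpose" scheme used in the first and third experiment groups can be sketched in the same way. The abstract does not list the lifting equations, so this sketch assumes the reversible 5/3 lifting steps as commonly given for JPEG 2000, with symmetric boundary extension and even-length rows. All function names are hypothetical, and plain Python again stands in for the CUDA kernels.

```python
def fwd53(x):
    """One-level forward 5/3 integer lifting of an even-length sequence.

    Returns the low-pass half followed by the high-pass half.
    """
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: detail = odd sample minus floor average of its even neighbours.
    d = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
         for i in range(half)]
    # Update: smooth = even sample plus rounded quarter of neighbouring details.
    s = [even[i] + ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
         for i in range(half)]
    return s + d


def inv53(c):
    """Exact inverse of fwd53: undo the update step, then the predict step."""
    half = len(c) // 2
    s, d = c[:half], c[half:]
    even = [s[i] - ((d[max(i - 1, 0)] + d[i] + 2) >> 2)
            for i in range(half)]
    odd = [d[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


def transpose(m):
    return [list(col) for col in zip(*m)]


def fwd53_2d(img):
    """Row transform, transpose, row transform again, transpose back."""
    rows = [fwd53(r) for r in img]
    cols = [fwd53(c) for c in transpose(rows)]
    return transpose(cols)


def inv53_2d(coef):
    """Invert fwd53_2d by undoing the column pass first, then the row pass."""
    cols = [inv53(c) for c in transpose(coef)]
    return [inv53(r) for r in transpose(cols)]
```

Reusing the row routine for the column pass keeps memory access row-oriented at the cost of two transposes; the groups without a transpose avoid that cost by transforming columns directly.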
