Compression techniques are widely used in data storage and transmission. However, because of their inherently sequential nature, most existing dictionary-based compression and decompression algorithms are designed for sequential execution on CPUs. To explore how much a graphics processing unit (GPU) can improve the performance of compression and decompression, this paper draws on coalesced memory access and parallel assembling to study two parallel implementations of dictionary-based techniques on CUDA (Compute Unified Device Architecture): stateless compression/decompression and LZW compression/decompression. The experimental results show that, compared with traditional sequential single-core implementations, the two proposed approaches significantly improve the performance of existing dictionary-based compression and decompression algorithms.