Retinex is an image restoration method, and the center/surround Retinex is well suited to parallelization because it relies on convolution with a large kernel to achieve dynamic range compression and color/lightness rendition. However, this enhancement capability comes at a high computational cost. This paper presents GPURetinex, a data-parallel algorithm based on GPGPU/CUDA. GPURetinex exploits the GPGPU's massively parallel architecture and hierarchical memory to improve efficiency, and it is further improved by optimizing memory usage and out-of-boundary extrapolation in the convolution step. In our experiments, GPURetinex achieves a 72-fold speedup over an optimized single-threaded CPU implementation based on OpenCV for images with 2048 × 2048 resolution. The proposed method also outperforms a Retinex implementation based on NPP (NVIDIA Performance Primitives).
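To make the computation concrete, the following is a minimal CPU reference sketch of a single-scale center/surround Retinex in plain Python. It is not the GPURetinex implementation described here. It assumes a Gaussian surround applied as two separable 1-D passes, clamp-to-edge handling for out-of-boundary pixels, and a +1 offset inside the logarithm to avoid log(0). The function names and the default `sigma` and `radius` values are hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the 1-D kernel (assumed clamp-to-edge border)."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                # Out-of-boundary reads are replaced by the nearest edge pixel.
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*image)]


def single_scale_retinex(image, sigma=2.0, radius=6):
    """Compute log(center) - log(surround) for a single-channel image."""
    kernel = gaussian_kernel(sigma, radius)
    # Separable surround: blur along rows, then along columns.
    horizontal = blur_rows(image, kernel)
    surround = transpose(blur_rows(transpose(horizontal), kernel))
    return [[math.log(p + 1.0) - math.log(s + 1.0)
             for p, s in zip(row, srow)]
            for row, srow in zip(image, surround)]
```

Each output pixel depends only on a fixed neighborhood of input pixels, which is why this step maps naturally onto one GPU thread per pixel. The cost of the inner loop grows with the kernel radius, which is consistent with the observation that large kernels make the method computationally intensive.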