A Novel Minimum Time Parallel 2-D Discrete Wavelet Transform Algorithm for General Purpose Processors

机译：一种新的通用处理器的最小时间并行二维离散小波变换算法

获取原文

页面导航

摘要
著录项
相似文献
相关主题

摘要

A novel efficient inplace, multithreaded, and cache-friendly parallel 2-D wavelet transform algorithm based on the lifting transform is introduced. In order to maximize the cache utilization and consequently minimize the memory bus bandwidth use, the threads compete to work on a small memory area maximizing the chance of finding it in the cache and their synchronization is done with very low overhead without the use of any locks and relying solely on the basic compare-and-swap (CAS) atomic primitive. An implementation in the C programming language with and without the use of vector (single instruction multiple data - SIMD) instructions is provided for both single (serial) and multi (parallel) threaded single-loop DWT implementations as well as serial and parallel naive implementations using linear (row order) and strided (column order) memory access patterns for comparison. Results show a significant improvement over the single-threaded optimized implementation and a much greater improvement over both the single and multi threaded naive implementations, reaching minimum running time depending on the number of processor cores and the available memory bus bandwidth, i.e., it becomes memory bound using the minimum number of memory accesses. Given the simplicity and high speed of the lifting steps, an analysis based on the number of memory bus operations (read and write) is done for images that are larger than twice the shared cache size which establishes a lower bound for the running time of all linear memory access algorithms and also determines the maximum speed gains to be expected in relation to currently implemented parallel schemes based on the parallel execution of independent lifting steps. It also shows the optimality of the parallel algorithm presented. Finally, a comparison with currently available implementations shows the gains achieved by the proposed algorithm.

机译：介绍了一种基于升降变换的新型高效的ILPLACE，多线程和高速缓冲友好的并联2-D小波变换算法。为了最大化高速缓存利用率，并因此最小化存储器总线带宽使用，线程竞争为在小的内存区域上工作，最大化在高速缓存中找到它的机会，并且在不使用任何锁的情况下使用非常低的开销进行它们的同步并仅依赖于基本比较和交换（CAS）原子原语。为单个（串行）和多（并行）螺纹单环DWT实现以及串行和并行Naive实现提供了C编程语言的C编程语言（单指令多数据 - SIMD）指令的实施使用线性（行顺序）和string（列顺序）存储器访问模式进行比较。结果显示对单线程优化实现的显着改进，以及对单个和多线程天真的实现的更大改进，取决于处理器内核的数量和可用的内存总线带宽，即它变为内存使用最小内存访问数量的绑定。鉴于提升步骤的简单性和高速，基于存储总线操作数（读写）的分析是为大于共享高速缓存大小的两倍的图像完成的，该图像为所有的运行时间建立下限线性存储器访问算法并根据基于独立提升步骤的并行执行，确定关于当前实现的并行方案的最大速度增益。它还显示了所呈现的并行算法的最优性。最后，与目前可用的实现的比较显示了所提出的算法实现的增益。

著录项

来源
《International Conference on Parallel Processing》|2017年|603p|共10页
会议地点
作者
Eduardo Moscoso Rubino; Alberto Jose Alvares; Raul Marin Prades; Pedro Sanz Valero;
展开▼
作者单位

展开▼
会议组织
原文格式 PDF
正文语种
中图分类 TP311.133.2-53;
关键词

相似文献

外文文献
中文文献
专利

1. A parallel implementation of the 2-D discrete wavelet transform without interprocessor communications [J] . Marino F., Piuri V. IEEE Transactions on Signal Processing . 1999,第11期

机译：无需处理器间通信的二维离散小波变换的并行实现
2. VLSI implementation of 2-D discrete wavelet transform for real-time video signal processing [J] . Chu Yu, Sao-Jie Chen IEEE Transactions on Consumer Electronics . 1997,第4期

机译：二维离散小波变换的VLSI实现，用于实时视频信号处理
3. A Fast Cross-Range Scaling Algorithm for ISAR Images Based on the 2-D Discrete Wavelet Transform and Pseudopolar Fourier Transform [J] . Dong Li, Chengxiang Zhang, Hongqing Liu, IEEE Transactions on Geoscience and Remote Sensing . 2019,第7期

机译：基于二维离散小波变换和伪极傅里叶变换的ISAR图像快速跨域缩放算法
4. A Novel Minimum Time Parallel 2-D Discrete Wavelet Transform Algorithm for General Purpose Processors [C] . Eduardo Moscoso Rubino, Alberto Jose Alvares, Raul Marin Prades, International Conference on Parallel Processing . 2017

机译：一种新的通用处理器的最小时间并行二维离散小波变换算法
5. Synthesis and development of image coding algorithms based on 2D discrete wavelet transform. [D] . Yok Wooi, Matthew Teow. 2011

机译：基于二维离散小波变换的图像编码算法的综合与发展。
6. An Analog Circuit Approximation of the Discrete Wavelet Transform for Ultra Low Power Signal Processing in Wearable Sensor Nodes [O] . Alexander J. Casson 2015

机译：穿戴式传感器节点中超低功率信号处理的离散小波变换的模拟电路近似
7. 2-D wavelet transform enhancement on general-purpose microprocessors: memory hierarchy and SIMD parallelism exploitation [O] . Chaver Martínez Daniel Ángel, Tenllado van der Reijden Christian, Piñuel Moreno Luis, 2002

机译：通用微处理器上的二维小波变换增强：存储器层次结构和SIMD并行性开发

A Novel Minimum Time Parallel 2-D Discrete Wavelet Transform Algorithm for General Purpose Processors

摘要

著录项

相似文献

相关主题

期刊订阅