Parallelization of Full Search Motion Estimation Algorithm for Parallel and Distributed Platforms

Eduarda Monteiro; Bruno Vizzotto; Claudio Diniz; Marilena Maule; Bruno Zatt; Sergio Bampi

Abstract

This work presents an efficient method for mapping the Full Search algorithm for Motion Estimation (ME) onto General Purpose Graphics Processing Unit (GPGPU) architectures using the Compute Unified Device Architecture (CUDA) programming model. Our method jointly exploits the massive parallelism available in current GPGPU devices and the parallelism potential of the Full Search algorithm. Our main goal is to evaluate the feasibility of implementing video codecs on GPGPUs, along with the advantages and drawbacks of doing so compared to other platforms. For comparison, three solutions were developed using distinct programming paradigms for distinct underlying hardware architectures: (i) a sequential solution for a general-purpose processor (GPP); (ii) a parallel solution for multi-core GPPs using the OpenMP library; and (iii) a distributed solution for cluster/grid machines using the Message Passing Interface (MPI) library. The CUDA-based solution for GPGPUs achieves speed-ups consistent with those indicated by the theoretical model for different search areas. Our GPGPU Full Search Motion Estimation provides 2×, 20× and 1664× speed-ups when compared to the MPI, OpenMP and sequential implementations, respectively. Compared to the state of the art, our solution reaches a speed-up of up to 17×.
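
To make the abstract's sequential baseline concrete, the following is a minimal C sketch of Full Search block matching for a single block. It is not the authors' implementation. The sum of absolute differences (SAD) as the matching criterion, the 8-bit luma frame layout, and all names (`MotionVector`, `block_sad`, `full_search`, `bs`, `range`) are assumptions made for illustration; the abstract does not specify them.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical result type: displacement and its matching cost. */
typedef struct { int dx, dy; unsigned sad; } MotionVector;

/* SAD between the bs x bs block of cur at (cx, cy) and the block of ref
 * at (rx, ry). Both frames are 8-bit, row-major, with the same stride. */
static unsigned block_sad(const unsigned char *cur, const unsigned char *ref,
                          int stride, int cx, int cy, int rx, int ry, int bs)
{
    unsigned sad = 0;
    for (int y = 0; y < bs; ++y)
        for (int x = 0; x < bs; ++x)
            sad += (unsigned)abs(cur[(cy + y) * stride + cx + x] -
                                 ref[(ry + y) * stride + rx + x]);
    return sad;
}

/* Full Search: evaluate every displacement in [-range, range] x [-range, range]
 * whose candidate block lies inside the reference frame, and keep the one
 * with the lowest SAD. Ties go to the first candidate in raster order. */
MotionVector full_search(const unsigned char *cur, const unsigned char *ref,
                         int width, int height, int cx, int cy,
                         int bs, int range)
{
    assert(bs > 0 && range >= 0);
    assert(cx >= 0 && cy >= 0 && cx + bs <= width && cy + bs <= height);

    MotionVector best = {0, 0, UINT_MAX};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = cx + dx, ry = cy + dy;
            if (rx < 0 || ry < 0 || rx + bs > width || ry + bs > height)
                continue; /* candidate falls outside the reference frame */
            unsigned sad = block_sad(cur, ref, width, cx, cy, rx, ry, bs);
            if (sad < best.sad) {
                best.dx = dx;
                best.dy = dy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

Each candidate's SAD is computed independently of the others, and so is each block's search. That independence is what the OpenMP, MPI and CUDA solutions named in the abstract can exploit, for example by assigning blocks or candidate positions to separate threads, processes or GPU threads. How the paper actually partitions the work is not stated in this record.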


Bibliographic record

Source
International Journal of Parallel Programming, 2014, Issue 2, pp. 239-264 (26 pages)
Authors
Eduarda Monteiro; Bruno Vizzotto; Claudio Diniz; Marilena Maule; Bruno Zatt; Sergio Bampi
Author affiliations

Informatics Institute, PPGC, PGMICRO, Federal University of Rio Grande do Sul (UFRGS), Porto Alegre, Brazil;
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Motion estimation; Block matching; GPU; CUDA; OpenMP; MPI;


