MRO-MPI: MapReduce overlapping using MPI and an optimized data exchange policy

Hisham Mohamed; Stephane Marchand-Maillet

Abstract

MapReduce is a programming model proposed to simplify large-scale data processing. In contrast, the message passing interface (MPI) standard is extensively used for algorithmic parallelization, as it accommodates an efficient communication infrastructure. In the original implementation of MapReduce, the reduce function can only start processing following termination of the map function. If the map function is slow for any reason, this will affect the whole running time. In this paper, we propose MapReduce overlapping using MPI, which is an adapted structure of the MapReduce programming model for fast intensive data processing. Our implementation is based on running the map and the reduce functions concurrently in parallel by exchanging partial intermediate data between them in a pipeline fashion using MPI. At the same time, we maintain the usability and the simplicity of MapReduce. Experimental results based on three different applications (WordCount, Distributed Inverted Indexing and Distributed Approximate Similarity Search) show a good speedup compared to the earlier versions of MapReduce such as Hadoop and the available MPI-MapReduce implementations.
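The overlap described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes Python threads and in-memory queues as stand-ins for MPI processes and messages, and the names `mapper`, `reducer`, `flush`, `word_count`, and `batch_size` are hypothetical. It only shows the general idea of mappers shipping partial intermediate WordCount results to reducers that merge them while the map phase is still running.

```python
import queue
import threading
from collections import Counter
from zlib import crc32

STOP = None  # sentinel: one mapper has finished sending to this reducer


def flush(partial, reducer_queues):
    """Send each non-empty partial count table to its reducer, then reset it."""
    for counts, q in zip(partial, reducer_queues):
        if counts:
            q.put(dict(counts))
            counts.clear()


def mapper(lines, reducer_queues, batch_size=2):
    """Count words and ship partial counts every `batch_size` lines (assumed policy)."""
    n = len(reducer_queues)
    partial = [Counter() for _ in range(n)]
    for i, line in enumerate(lines, start=1):
        for word in line.split():
            # Deterministic partitioning: a given word always reaches the same reducer.
            partial[crc32(word.encode()) % n][word] += 1
        if i % batch_size == 0:
            flush(partial, reducer_queues)
    flush(partial, reducer_queues)  # send whatever is left
    for q in reducer_queues:
        q.put(STOP)


def reducer(q, n_mappers, out):
    """Merge partial counts as they arrive, until every mapper has sent STOP."""
    finished = 0
    while finished < n_mappers:
        item = q.get()
        if item is STOP:
            finished += 1
        else:
            out.update(item)  # Counter.update adds counts instead of replacing them


def word_count(chunks, n_reducers=2):
    """Run one mapper per chunk and `n_reducers` reducers concurrently."""
    queues = [queue.Queue() for _ in range(n_reducers)]
    results = [Counter() for _ in range(n_reducers)]
    reducers = [threading.Thread(target=reducer, args=(q, len(chunks), r))
                for q, r in zip(queues, results)]
    mappers = [threading.Thread(target=mapper, args=(c, queues)) for c in chunks]
    for t in reducers + mappers:  # reducers start before any map output exists
        t.start()
    for t in mappers + reducers:
        t.join()
    total = Counter()
    for r in results:
        total.update(r)
    return dict(total)


if __name__ == "__main__":
    print(word_count([["a b a", "b c"], ["a c c"]]))
```

In this sketch a smaller `batch_size` means reducers start useful work sooner at the cost of more messages, which is the kind of trade-off a data exchange policy has to balance. In an MPI setting the queues would plausibly be replaced by point-to-point sends and receives, but that mapping is an assumption here, not something the abstract specifies.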


Bibliographic details

Source
Parallel Computing | 2013, Issue 12 | pp. 851-866 | 16 pages
Authors
Hisham Mohamed; Stephane Marchand-Maillet
Affiliations

Viper Group, Computer Vision and Multimedia Laboratory, University of Geneva, 7 Route de Drize, Geneva, Switzerland (both authors)

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
MapReduce overlapping; MPI-MapReduce; Parallel MapReduce; Big data; Large scale data processing;


Similar literature

1. A study of the influence of VM allocation policies on MPI Bcast and MPI Exchange latency in cloud [J]. Fernando Gomez Folgar, Guillermo Indalecio Fernandez, Jose Isaac Zablah Avila. Latin America Transactions, IEEE (Revista IEEE America Latina). 2017, Issue 8
2. A Study Of The Influence Of VM Allocation Policies On MPI Bcast And MPI Exchange Latency In Cloud [J]. Gomez F., Indalecio G., Zablah J. I. Limnology and oceanography, methods. 2017, Issue 8
3. An Improved Algorithm for Optimizing MapReduce Based on Locality and Overlapping [J]. Jianjiang Li, Jie Wang, Bin Lyu. Journal of Tsinghua University (English Edition). 2018, Issue 6
4. Enhancing MapReduce Using MPI and an Optimized Data Exchange Policy [C]. Hisham Mohamed, Stephane Marchand-Maillet. 41st International Conference on Parallel Processing Workshops. 2012
5. Overlapping computation and communication through offloading in MPI over InfiniBand [D]. Inozemtsev, Grigori. 2014
6. Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce [O]. Jianfang Cao, Hongyan Cui, Hao Shi
7. To Overlap or Not to Overlap: Optimizing Incremental MapReduce Computations for On-Demand Data Upload [O]. Stefan Ene, Bogdan Nicolae, Alexandru Costan. 2014
