Journal: Parallel Computing

MRO-MPI: MapReduce overlapping using MPI and an optimized data exchange policy


Abstract

MapReduce is a programming model proposed to simplify large-scale data processing. The Message Passing Interface (MPI) standard, in contrast, is widely used for parallelizing algorithms because it provides an efficient communication infrastructure. In the original implementation of MapReduce, the reduce function can start processing only after the map function has terminated, so a slow map phase delays the whole job. In this paper, we propose MapReduce overlapping using MPI (MRO-MPI), an adapted structure of the MapReduce programming model for fast data-intensive processing. Our implementation runs the map and reduce functions concurrently, exchanging partial intermediate data between them in a pipelined fashion using MPI, while maintaining the usability and simplicity of MapReduce. Experimental results on three applications (WordCount, Distributed Inverted Indexing, and Distributed Approximate Similarity Search) show a good speedup compared with earlier MapReduce implementations such as Hadoop and with the available MPI-MapReduce implementations.
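
The overlapping idea in the abstract can be illustrated with a small WordCount sketch. This is not the paper's implementation: it is a single-process stand-in, assuming Python threads and a `queue.Queue` in place of MPI ranks and MPI send/receive calls. The names `mapper`, `reducer`, `word_count_overlapped`, `SENTINEL`, and the `chunk_size` parameter are hypothetical and chosen only for illustration. Each mapper emits partial counts chunk by chunk, and the reducer merges them as they arrive rather than waiting for every mapper to finish.

```python
import queue
import threading
from collections import Counter

# Assumed convention: a mapper puts SENTINEL on the channel when it is done.
SENTINEL = None


def mapper(lines, out_queue, chunk_size=2):
    """Emit partial word counts per chunk instead of after the whole split."""
    for start in range(0, len(lines), chunk_size):
        partial = Counter()
        for line in lines[start:start + chunk_size]:
            partial.update(line.lower().split())
        # Stands in for sending partial intermediate data over MPI.
        out_queue.put(partial)
    out_queue.put(SENTINEL)


def reducer(in_queue, num_mappers, totals):
    """Merge partial counts as they arrive, while mappers may still be running."""
    finished = 0
    while finished < num_mappers:
        item = in_queue.get()  # stands in for an MPI receive
        if item is SENTINEL:
            finished += 1
        else:
            totals.update(item)


def word_count_overlapped(splits):
    """Run one mapper per input split and a single reducer concurrently."""
    channel = queue.Queue()
    totals = Counter()
    reduce_thread = threading.Thread(
        target=reducer, args=(channel, len(splits), totals)
    )
    reduce_thread.start()
    map_threads = [
        threading.Thread(target=mapper, args=(split, channel)) for split in splits
    ]
    for thread in map_threads:
        thread.start()
    for thread in map_threads:
        thread.join()
    reduce_thread.join()
    return dict(totals)


if __name__ == "__main__":
    print(word_count_overlapped([["a b a", "b c"], ["a c c"]]))
```

In a real MPI setting the queue would presumably be replaced by point-to-point messages between mapper and reducer ranks; how MRO-MPI actually partitions keys and schedules the exchange is not described in the abstract, so none of that is modeled here.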