Journal: IEEE Transactions on Parallel and Distributed Systems

Design and Evaluation of Network-Levitated Merge for Hadoop Acceleration


Abstract

Hadoop is a popular open source implementation of the MapReduce programming model for cloud computing. However, it faces a number of issues in achieving the best performance from the underlying systems. These include a serialization barrier that delays the reduce phase, repetitive merges and disk accesses, and a lack of portability across different interconnects. To keep up with the increasing volume of data sets, Hadoop also requires efficient I/O capability from the underlying computer systems to process and analyze data. We describe Hadoop-A, an acceleration framework that optimizes Hadoop with plug-in components for fast data movement, overcoming the existing limitations. A novel network-levitated merge algorithm is introduced to merge data without repetition and disk access. In addition, a full pipeline is designed to overlap the shuffle, merge, and reduce phases. Our experimental results show that Hadoop-A significantly speeds up data movement in MapReduce and doubles the throughput of Hadoop. In addition, Hadoop-A significantly reduces disk accesses caused by intermediate data.
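
The abstract does not spell out how the network-levitated merge works, so the following is only a rough illustration of the general idea it names: merging sorted segments in a single pass without writing intermediate results to disk. It is a generic heap-based k-way merge, not the Hadoop-A algorithm itself. The function name `streaming_merge` and the in-memory lists standing in for remotely held map outputs are assumptions made for this sketch.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, int]  # (key, value); a hypothetical record shape


def streaming_merge(segments: List[Iterable[Record]]) -> Iterator[Record]:
    """Merge already-sorted segments in one pass.

    Only one pending record per segment is held in the heap, so nothing
    is merged twice and nothing is spilled to disk. This is a generic
    k-way merge, not the paper's implementation.
    """
    iterators = [iter(segment) for segment in segments]
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            # idx breaks ties between equal keys, so values are never compared
            heap.append((first[0], idx, first[1]))
    heapq.heapify(heap)

    while heap:
        key, idx, value = heapq.heappop(heap)
        yield key, value
        # Pull the next record from the segment that just produced output
        following = next(iterators[idx], None)
        if following is not None:
            heapq.heappush(heap, (following[0], idx, following[1]))


if __name__ == "__main__":
    # Each list stands in for one sorted map output segment
    segments = [[("a", 1), ("d", 4)], [("b", 2), ("c", 3)], []]
    for record in streaming_merge(segments):
        print(record)
```

Because the merge is a generator, a consumer such as a reduce function could start processing records as soon as the first one is emitted, which is the kind of overlap between merge and reduce that the abstract attributes to the full pipeline.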
