Journal: ACM Transactions on Architecture and Code Optimization

Tiled-MapReduce: Efficient and Flexible MapReduce Processing on Multicore with Tiling


Abstract

The prevalence of chip multiprocessors opens opportunities to run data-parallel applications, originally designed for clusters, on a single machine with many cores. MapReduce, a simple and elegant programming model for large-scale clusters, has recently been shown to be a promising alternative for harnessing the multicore platform. Differences between clusters and multicore platforms, such as memory hierarchy and communication patterns, raise new challenges in designing and implementing an efficient MapReduce system on multicore. This article argues that, on shared-memory multicore platforms, it is more efficient for MapReduce to iteratively process small chunks of data in turn than to process one large chunk of data at a time. Based on this argument, we extend the general MapReduce programming model with a "tiling strategy", called Tiled-MapReduce (TMR). TMR partitions a large MapReduce job into a number of small subjobs and iteratively processes one subjob at a time with efficient use of resources; TMR finally merges the results of all subjobs for output. Based on Tiled-MapReduce, we design and implement several optimization techniques targeting multicore, including the reuse of the input buffer among subjobs, a NUCA/NUMA-aware scheduler, and pipelining a subjob's reduce phase with the successive subjob's map phase, to optimize the use of memory, cache, and CPU resources, respectively. Further, we demonstrate that Tiled-MapReduce supports fine-grained fault tolerance and enables several usage scenarios such as online and incremental computing on multicore machines. Performance evaluation of our prototype system, called Ostrich, on a 48-core machine shows that Ostrich saves up to 87.6% memory, incurs fewer cache misses, and makes more efficient use of CPU cores, resulting in a speedup ranging from 1.86x to 3.07x over Phoenix. Ostrich also efficiently supports fine-grained fault tolerance, online computing, and incremental computing with a small performance penalty.
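
The tiling strategy described in the abstract (partition the job into small subjobs, process one subjob at a time, then merge all subjob results) can be illustrated with a minimal sketch. This is not code from Ostrich: it assumes word count as the workload, runs sequentially, and the names `tiles`, `run_subjob`, and `tiled_word_count` are hypothetical. It does not model the input-buffer reuse, NUCA/NUMA-aware scheduling, or pipelining mentioned above.

```python
from collections import Counter
from itertools import islice
from typing import Iterable, Iterator, List


def tiles(records: Iterable[str], tile_size: int) -> Iterator[List[str]]:
    """Split the input into successive chunks of at most tile_size records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, tile_size))
        if not chunk:
            return
        yield chunk


def run_subjob(chunk: List[str]) -> Counter:
    """Run map and reduce on one tile: count the words in this chunk only."""
    partial = Counter()
    for line in chunk:
        for word in line.split():  # map: emit one count per word
            partial[word] += 1     # reduce: combine counts per key
    return partial


def tiled_word_count(records: Iterable[str], tile_size: int = 2) -> Counter:
    """Process one subjob at a time, then merge all partial results."""
    partials = [run_subjob(chunk) for chunk in tiles(records, tile_size)]
    total = Counter()
    for partial in partials:       # final merge across subjobs
        total.update(partial)
    return total


if __name__ == "__main__":
    lines = ["a b a", "b c", "a"]
    print(tiled_word_count(lines, tile_size=2))
```

In this sketch only one tile of input is held in memory at a time, and each subjob's partial result is small compared with its input, which is the intuition behind the memory savings the abstract reports. The actual savings depend on the workload and on the system's implementation.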
