Home > Foreign-language conferences > Database Systems for Advanced Applications > HadoopM: A Message-Enabled Data Processing System on Large Clusters

HadoopM: A Message-Enabled Data Processing System on Large Clusters

Abstract

MapReduce, a popular platform for solving embarrassingly parallel problems, has been used extensively on large commodity clusters. However, because of the embarrassingly parallel assumption, some computation patterns are not easy to express in MapReduce, and in some cases, such as iteration and map-phase filtering from a holistic perspective, performance and efficiency cannot be achieved without communication between tasks. This paper presents HadoopM, a message-enhanced version of the Hadoop MapReduce architecture that breaks the key embarrassingly parallel assumption and can execute MR jobs more efficiently and elegantly. HadoopM allows user-defined messages to be passed between mappers or reducers through two message-passing mechanisms, lightweight and heavyweight, and supports both asynchronous and synchronous message passing. HadoopM retains the scalability and fault tolerance of Hadoop and is binary compatible with Hadoop MapReduce. Our experimental results demonstrate that HadoopM outperforms the original Hadoop MapReduce on a range of algorithms. In some cases, such as PageRank and Skyline, HadoopM improves job performance by up to 50%.

Bibliographic details

  • Source
  • Conference location: Bali (ID)
  • Author affiliations

    School of Computer Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China; Guangdong Key Laboratory of Popular High Performance Computers, Shenzhen Key Laboratory of Service Computing and Applications, Shenzhen 518060, China


  • Conference organizer
  • Original format: PDF
  • Language of text: eng
  • CLC classification
  • Keywords
