
MARIANE: Using MApReduce in HPC environments


Abstract

MapReduce is increasingly becoming a popular programming model. However, the widely used implementation, Apache Hadoop, uses the Hadoop Distributed File System (HDFS), which is currently not directly applicable to a majority of existing HPC environments such as Teragrid and NERSC that support other distributed file systems. On such resourceful High Performance Computing (HPC) infrastructures, the MapReduce model can rarely make use of full resources, as special circumstances must be created for its adoption, or simply limited resources must be isolated to the same end. This paper not only presents a MapReduce implementation directly suitable for such environments, but also exposes the design choices for better performance gains in those settings. By leveraging inherent distributed file systems' functions, and abstracting them away from its MapReduce framework, MARIANE (MApReduce Implementation Adapted for HPC Environments) not only allows for the use of the model in an expanding number of HPC environments, but also shows better performance in such settings. This paper identifies the components and trade-offs necessary for this model, and quantifies the performance gains exhibited by our approach in HPC environments over Apache Hadoop in a data intensive setting at the National Energy Research Scientific Computing Center (NERSC).

Bibliographic details

  • Source
    Future Generation Computer Systems | 2014, Issue 7 | pp. 379-388 | 10 pages
  • Author affiliations

    Grid and Cloud Computing Research Laboratory, Department of Computer Science, State University of New York (SUNY) at Binghamton, Vestal, NY 13902, United States;

    Grid and Cloud Computing Research Laboratory, Department of Computer Science, State University of New York (SUNY) at Binghamton, Vestal, NY 13902, United States;

    Grid and Cloud Computing Research Laboratory, Department of Computer Science, State University of New York (SUNY) at Binghamton, Vestal, NY 13902, United States;

    Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, United States;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language of text: English
  • Keywords

    Hadoop; MapReduce; Data intensive; Scientific computing;

