Conference: IEEE International Conference on Cloud Computing

Performance Measurement on Scale-Up and Scale-Out Hadoop with Remote and Local File Systems

Abstract

MapReduce is a popular computing model for parallel data processing on large-scale datasets, which can range from gigabytes to terabytes and petabytes. Although Hadoop MapReduce normally uses the Hadoop Distributed File System (HDFS) as its local file system, it can also be configured to use a remote file system. This raises an interesting question: for a given application, which is the best platform among the different combinations of scale-up and scale-out Hadoop with remote and local file systems? However, no previous research has examined how different types of applications (e.g., CPU-intensive, data-intensive) with different characteristics (e.g., input data size) can benefit from the different platforms. In this paper, we therefore conduct a comprehensive performance measurement of different applications on scale-up and scale-out clusters configured with HDFS and with a remote file system (i.e., OFS), respectively. We identify and study how different job characteristics (e.g., input data size, the number of file reads/writes, and the amount of computation) affect the performance of different applications on the different platforms. This study is expected to provide guidance for users in choosing the best platform to run applications with different characteristics in environments that provide both remote and local storage, such as HPC clusters.
