Multi-job Hadoop scheduling to process Geo-distributed big data

Abstract

Effective big data analysis is one of the most notable research challenges of the last few years. Hadoop, the most popular implementation of the MapReduce framework, has become widely used for processing large data sets with cloud resources. However, in many scenarios data are geographically distributed across data centers, and moving them to a single site for processing may be extremely expensive, when it is feasible at all. A key challenge for running applications in such a geographically distributed environment is how to efficiently schedule the computation over the different data centers. In this work we present a job scheduler for a Hierarchical Hadoop Framework (H2F) that manages multiple job execution requests while ensuring efficient use of the available resources. Our experimental evaluations show that H2F significantly improves processing time for geo-distributed data sets with respect to a plain Hadoop system.
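
To make the scheduling problem concrete, here is a minimal, self-contained Java sketch of a data-locality heuristic for geo-distributed MapReduce jobs. It is not the H2F scheduler described in the paper; all class and method names are hypothetical. The assumption illustrated is simply that each job's map phase runs in the data center holding the largest share of its input, so only the smaller remainder of the data crosses the wide-area network.

```java
import java.util.*;

// Illustrative sketch only: a locality-aware placement heuristic for
// geo-distributed MapReduce jobs. NOT the H2F algorithm from the paper.
public class GeoJobPlacementSketch {

    // A block of input data residing in a given data center, in megabytes.
    record DataBlock(String job, String dataCenter, long sizeMb) {}

    public static void main(String[] args) {
        List<DataBlock> blocks = List.of(
            new DataBlock("jobA", "dc-eu", 4_000),
            new DataBlock("jobA", "dc-us", 1_000),
            new DataBlock("jobB", "dc-us", 6_000)
        );

        // Aggregate input size per job and per data center.
        Map<String, Map<String, Long>> perJob = new HashMap<>();
        for (DataBlock b : blocks) {
            perJob.computeIfAbsent(b.job(), j -> new HashMap<>())
                  .merge(b.dataCenter(), b.sizeMb(), Long::sum);
        }

        // Greedy rule: place each job where most of its input already lives,
        // and report how much data would still have to move over the WAN.
        perJob.forEach((job, sizes) -> {
            String target = Collections.max(sizes.entrySet(),
                    Map.Entry.comparingByValue()).getKey();
            long moved = sizes.values().stream()
                    .mapToLong(Long::longValue).sum() - sizes.get(target);
            System.out.printf("%s -> run at %s, %d MB moved over WAN%n",
                    job, target, moved);
        });
    }
}
```

A real hierarchical scheduler would additionally weigh per-site compute capacity, inter-site bandwidth, and contention among the multiple concurrent jobs; the sketch only captures the data-locality term of that trade-off.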
