Journal: International Journal of Information Technology and Computer Science

Delay Scheduling Based Replication Scheme for Hadoop Distributed File System

Abstract

The volume of data generated and processed by modern computing systems is growing rapidly. MapReduce is an important programming model for large-scale, data-intensive applications. Hadoop is a popular open-source implementation of MapReduce and the Google File System (GFS). Its scalability and fault tolerance have made it a standard for big data processing. Hadoop stores data in the Hadoop Distributed File System (HDFS), which achieves data reliability and fault tolerance through replication. This paper proposes a new technique, the Delay Scheduling Based Replication Algorithm (DSBRA), which uses information collected from the scheduler to identify popular files/blocks in HDFS and replicate them, and to identify unpopular ones and dereplicate them. Experimental results show that the proposed method improves response time by 13% and locality by 7% over existing algorithms.
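The abstract does not spell out how DSBRA turns scheduler information into replication decisions, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the scheduler signal is a per-file count of tasks that could not be placed on a node holding their data within a time window. The class `ReplicationAdvisor`, its thresholds, and the one-replica-per-window step are all hypothetical choices made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ReplicationAdvisor:
    """Hypothetical helper: adjusts per-file replication from locality misses."""

    default_replication: int = 3   # HDFS default replication factor
    max_replication: int = 6       # assumed upper bound, not from the paper
    hot_threshold: int = 5         # misses per window that mark a file popular
    cold_threshold: int = 0        # misses at or below this mark it unpopular
    misses: Counter = field(default_factory=Counter)
    replication: dict = field(default_factory=dict)

    def record_locality_miss(self, path: str) -> None:
        # Called when the scheduler delays or launches a task non-locally.
        self.misses[path] += 1

    def end_window(self) -> dict:
        """Close the current window and return {path: new_factor} for changes."""
        changes = {}
        for path in set(self.misses) | set(self.replication):
            current = self.replication.get(path, self.default_replication)
            count = self.misses[path]  # Counter yields 0 for unseen paths
            if count >= self.hot_threshold and current < self.max_replication:
                target = current + 1   # replicate a popular file
            elif count <= self.cold_threshold and current > self.default_replication:
                target = current - 1   # dereplicate an unpopular file
            else:
                target = current
            if target != current:
                changes[path] = target
            self.replication[path] = target
        self.misses.clear()
        return changes


advisor = ReplicationAdvisor(hot_threshold=2)
advisor.record_locality_miss("/data/a")
advisor.record_locality_miss("/data/a")
first_window = advisor.end_window()    # file was popular in this window
second_window = advisor.end_window()   # no misses, extra replica is dropped
```

In this sketch a file never drops below the default replication factor, so reliability is unaffected by dereplication. On a real cluster, the returned factors could be applied with a command such as `hdfs dfs -setrep`, although how DSBRA itself applies them is not stated in the abstract.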