International Journal of Distributed Sensor Networks

Dynamic Deduplication Decision in a Hadoop Distributed File System

Abstract

In the big data era, users generate and update data at tremendous speed through all kinds of devices, anytime and anywhere. Coping with these multiform data in real time is a serious challenge. The Hadoop Distributed File System (HDFS) is designed to handle such data when building a distributed data center. HDFS uses data duplicates (replicas) to increase data reliability. However, these duplicates require a great deal of extra storage space and infrastructure investment. The deduplication technique can effectively improve utilization of the storage space. In this paper, we propose a dynamic deduplication decision to improve the storage utilization of a data center that uses HDFS as its file system. The proposed system formulates a proper deduplication strategy to fully utilize the storage space available on limited storage devices. The deduplication strategy deletes useless duplicates to reclaim storage space. Experimental results show that our method can efficiently improve the storage utilization of a data center using HDFS.
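The abstract describes the deduplication decision only at a high level. As a minimal sketch of where such a decision can hook into HDFS, and not the authors' actual algorithm, the example below uses the standard Hadoop FileSystem API: when cluster utilization crosses a hypothetical high-water mark, it trims one surplus replica from each file under a given directory via setReplication. The HIGH_WATERMARK and MIN_REPLICATION values and the directory path are illustrative assumptions.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;
import org.apache.hadoop.fs.Path;

public class DedupDecisionSketch {

    // Hypothetical policy values, not taken from the paper.
    private static final double HIGH_WATERMARK = 0.80; // act only above 80% cluster usage
    private static final short MIN_REPLICATION = 1;    // never remove the last copy

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            // Overall cluster capacity and usage reported by the NameNode.
            FsStatus status = fs.getStatus();
            double utilization = (double) status.getUsed() / status.getCapacity();

            if (utilization <= HIGH_WATERMARK) {
                return; // enough free space, keep all replicas for reliability
            }

            // Storage is tight: trim one surplus replica per file in the target directory.
            Path dir = new Path(args.length > 0 ? args[0] : "/data");
            for (FileStatus file : fs.listStatus(dir)) {
                if (file.isFile() && file.getReplication() > MIN_REPLICATION) {
                    short target = (short) (file.getReplication() - 1);
                    // setReplication asks the NameNode to drop surplus block replicas.
                    fs.setReplication(file.getPath(), target);
                }
            }
        }
    }
}
```

A real decision engine would also weigh file access frequency and reliability requirements before lowering any replication factor; the sketch only shows the HDFS calls such a decision ultimately relies on.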
