Journal: Technical Gazette

Data Deduplication Technology for Cloud Storage


Abstract

With the explosive growth of data, storage systems have entered the cloud storage era. Although distributed file systems form the core of cloud storage systems and solve the problem of mass data storage, a large amount of duplicate data exists in all storage systems. File systems are designed to control how files are stored and retrieved. Few studies focus on application-level deduplication for cloud file systems, especially for the Hadoop Distributed File System (HDFS). In this paper, we design a file deduplication framework on HDFS for cloud application developers. The two proposed solutions, RFD-HDFS and FD-HDFS, deduplicate data online, which improves storage space utilisation and reduces redundancy. At the end of the paper, we test the disk utilisation and file upload performance of RFD-HDFS and FD-HDFS, and compare the disk utilisation of the two frameworks with that of HDFS. The results show that the two frameworks not only implement data deduplication but also effectively reduce the disk space used by duplicate files. The proposed framework can therefore reduce storage space by eliminating redundant HDFS files.
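
As a rough illustration of the file-level deduplication idea described in the abstract, the following minimal sketch stores each unique file content once, keyed by a content digest, and maps logical paths to digests. It uses in-memory dictionaries in place of HDFS. The class name `DedupStore`, its methods, and the choice of SHA-256 are assumptions made for illustration; the abstract does not describe the internals of RFD-HDFS or FD-HDFS, and this is not their design.

```python
import hashlib


class DedupStore:
    """Toy file-level deduplicating store (hypothetical; not the paper's design)."""

    def __init__(self):
        self.blobs = {}  # digest -> file content, stored only once
        self.paths = {}  # logical path -> digest of its content

    def upload(self, path: str, data: bytes) -> bool:
        """Store data under path; return True only if new content was written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        # Duplicate uploads only add a path entry pointing at the existing content.
        self.paths[path] = digest
        return is_new

    def read(self, path: str) -> bytes:
        """Return the content that the logical path refers to."""
        return self.blobs[self.paths[path]]

    def stored_bytes(self) -> int:
        """Total bytes of unique content actually kept."""
        return sum(len(blob) for blob in self.blobs.values())
```

Uploading the same bytes under two different paths leaves `stored_bytes()` unchanged after the second upload, which is the disk-utilisation effect the abstract reports measuring. A real system would also need reference counting so that content is freed when its last path is removed; that is omitted here.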
