International Conference on Parallel Processing

Hysteresis Re-chunking Based Metadata Harnessing Deduplication of Disk Images


Abstract

Metadata-related overhead can significantly impact the performance of data deduplication systems, including the actual duplication elimination ratio and the deduplication throughput. The amount of metadata produced is mainly determined by the chunking mechanism applied to the input data stream. In this paper, we propose a metadata harnessing deduplication (MHD) algorithm that uses a duplication-distribution-based hysteresis re-chunking strategy. MHD harnesses the metadata by dynamically merging multiple non-duplicate chunks into one big chunk represented by a single hash value, while dividing big chunks that straddle duplicate and non-duplicate data regions into small chunks represented by multiple hashes. Experimental results show that, for a given duplication elimination ratio, the proposed algorithm achieves lower metadata overhead and higher deduplication throughput than other state-of-the-art algorithms such as Bimodal, Sub Chunk and Sparse Indexing.
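For illustration only, the Python sketch below shows the chunk-merging half of this idea under simplifying assumptions: the function name merge_nonduplicate_chunks, the fixed max_merge cap, and the use of SHA-1 are ours rather than the paper's, and the hysteresis re-chunking pass that later splits big chunks straddling duplicate regions is omitted.

import hashlib

def sha1(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()

def merge_nonduplicate_chunks(chunks, known_hashes, max_merge=8):
    # Merge runs of consecutive non-duplicate chunks into one big chunk
    # stored under a single hash; duplicate chunks are emitted on their
    # own so they can be eliminated against the existing index.
    out = []       # list of (hash, data); data is None when already stored
    pending = []   # current run of non-duplicate chunks awaiting merging

    def flush():
        if pending:
            merged = b"".join(pending)
            out.append((sha1(merged), merged))
            pending.clear()

    for data in chunks:
        h = sha1(data)
        if h in known_hashes:
            flush()                # a duplicate ends the non-duplicate run
            out.append((h, None))  # reference the stored copy, no new data
        else:
            pending.append(data)
            if len(pending) >= max_merge:
                flush()            # cap big-chunk size
    flush()
    return out

# Example: the middle chunk is a known duplicate, so the chunks on either
# side of it are merged into separate big chunks.
index = {sha1(b"BBBB")}
print(merge_nonduplicate_chunks([b"AAAA", b"BBBB", b"CCCC", b"DDDD"], index))

Merging non-duplicate runs reduces the number of hash entries kept per unit of stored data, which is the metadata saving the abstract refers to; the paper's contribution is deciding, with hysteresis, when such big chunks must later be re-split because new input overlaps only part of them.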
