Hysteresis Re-chunking Based Metadata Harnessing Deduplication of Disk Images

Abstract

Metadata-related overhead can significantly impact the performance of data deduplication systems, including the real duplication elimination ratio and the deduplication throughput. The amount of metadata produced is mainly determined by the chunking mechanism for the input data stream. In this paper, we propose a metadata harnessing deduplication (MHD) algorithm utilizing a duplication-distribution-based hysteresis re-chunking strategy. MHD harnesses the metadata by dynamically merging multiple non-duplicate chunks into one big chunk represented by one hash value while dividing big chunks straddling duplicate and non-duplicate data regions into small chunks represented with multiple hashes. Experimental results show that the proposed algorithm achieves a lower metadata overhead and a higher deduplication throughput for a given duplication elimination ratio, as compared with other state-of-the-art algorithms such as the Bimodal, Sub Chunk and Sparse Indexing algorithms.
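The merging half of the idea described in the abstract can be pictured with a small sketch. This is not the paper's MHD algorithm: the hysteresis rule and the splitting of big chunks are omitted, and the function name `rechunk`, the `max_merge` limit, the use of SHA-1, and the in-memory set of known hashes are all assumptions made for illustration.

```python
import hashlib
from itertools import groupby


def sha1(data: bytes) -> str:
    """Hex digest used as the chunk fingerprint (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def rechunk(small_chunks, known_hashes, max_merge=4):
    """Return one (hash, length) metadata entry per output chunk.

    Hypothetical helper: consecutive non-duplicate small chunks are merged
    into big chunks of at most `max_merge` small chunks, each represented
    by a single hash. Duplicate small chunks keep their own hash so they
    can still be matched against the index.
    """
    flagged = [(sha1(c) in known_hashes, c) for c in small_chunks]
    entries = []
    for is_dup, run in groupby(flagged, key=lambda item: item[0]):
        run_chunks = [chunk for _, chunk in run]
        if is_dup:
            # Duplicate region: keep fine-grained hashes.
            entries.extend((sha1(c), len(c)) for c in run_chunks)
        else:
            # Non-duplicate region: one hash per merged big chunk.
            for i in range(0, len(run_chunks), max_merge):
                big = b"".join(run_chunks[i:i + max_merge])
                entries.append((sha1(big), len(big)))
    return entries
```

With six 4-byte chunks of which only the third is already in the index, this sketch emits three metadata entries (lengths 8, 4, and 12) instead of six, which is the kind of metadata reduction the abstract refers to.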

Bibliographic details

Source: International Conference on Parallel Processing, 2013, pp. 389-398 (10 pages)
Authors: Zhou Bing; Wen Jiangtao
Original format: PDF
Keywords: Data Deduplication; Metadata Harnessing

Similar literature

1. A Data Deduplication Framework of Disk Images with Adaptive Block Skipping [J]. Bing Zhou, Jiang-Tao Wen. Journal of Computer Science and Technology (English edition), 2016, Issue 4.
2. Improving Metadata Caching Efficiency for Data Deduplication via In-RAM Metadata Utilization [J]. Bing Zhou, Jiang-Tao Wen. Journal of Computer Science and Technology (English edition), 2016, Issue 4.
3. An unsupervised heuristic-based approach for bibliographic metadata deduplication [J]. Eduardo N. Borges, Moises G. de Carvalho, Renata Galante. Information Processing & Management, 2011, Issue 5.
4. Hysteresis Re-chunking Based Metadata Harnessing Deduplication of Disk Images [C]. Zhou Bing, Wen Jiangtao. International Conference on Parallel Processing, 2013.
5. Data content mining: Extracting and cataloging content-based metadata from satellite images (remote sensing) [D]. Harberts, Robert Lawrence. 1996.
6. A 4DCT imaging-based breathing lung model with relative hysteresis [O]. Shinjiro Miyawaki, Sanghun Choi, Eric A. Hoffman.
7. The Effectiveness of Deduplication on Virtual Machine Disk Images [O]. Keren Jin, Ethan L. Miller. 2010.
