Conference: Data Compression Conference

Combining Deduplication and Delta Compression to Achieve Low-Overhead Data Reduction on Backup Datasets


Abstract

Data reduction has become increasingly important in storage systems due to the explosive growth of digital data that has ushered in the big data era. In this paper, we present DARE, a Deduplication-Aware Resemblance detection and Elimination scheme for compressing backup datasets that combines data deduplication and delta compression to achieve high data reduction efficiency at low overhead. The main idea behind DARE is a scheme called Duplicate-Adjacency based Resemblance Detection (DupAdj), which considers any two data chunks to be similar (i.e., candidates for delta compression) if their respective adjacent data chunks are found to be duplicates in a deduplication system; an improved super-feature approach then further enhances the resemblance detection efficiency. Our experimental results on real-world and synthetic backup datasets show that, by supplementing deduplication with delta compression, DARE achieves additional data reduction by a factor of more than 2 (2X) on top of deduplication with very low overhead, while nearly doubling the data restore performance of deduplication-only systems.
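The duplicate-adjacency idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `fingerprint` and `dupadj_candidates`, the use of SHA-1 fingerprints, the pre-chunked in-memory lists, and the restriction to immediate neighbours are all assumptions made for illustration. It only shows how a non-duplicate chunk next to a duplicate chunk could be paired with the corresponding neighbour in the older stream as a delta-compression candidate.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    # Hypothetical helper: SHA-1 is assumed here as the chunk fingerprint.
    return hashlib.sha1(chunk).hexdigest()


def dupadj_candidates(old_chunks, new_chunks):
    """Return {new_index: old_index} pairs that are delta-compression candidates.

    A non-duplicate chunk in the new stream is paired with the chunk at the
    matching offset in the old stream when its immediate neighbour is a
    duplicate (illustrative one-hop version of duplicate adjacency).
    """
    # Fingerprint index of the old stream: fingerprint -> first position.
    old_index = {}
    for j, chunk in enumerate(old_chunks):
        old_index.setdefault(fingerprint(chunk), j)

    # Deduplication pass: which new chunks already exist in the old stream?
    dup_of = {}
    for i, chunk in enumerate(new_chunks):
        j = old_index.get(fingerprint(chunk))
        if j is not None:
            dup_of[i] = j

    # Adjacency pass: look at the neighbours of every duplicate pair.
    candidates = {}
    for i, j in dup_of.items():
        for step in (-1, 1):
            ni, nj = i + step, j + step
            in_range = 0 <= ni < len(new_chunks) and 0 <= nj < len(old_chunks)
            if in_range and ni not in dup_of:
                candidates.setdefault(ni, nj)
    return candidates


if __name__ == "__main__":
    old = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
    new = [b"aaaa", b"bbXb", b"cccc", b"dddd"]  # only chunk 1 was modified
    print(dupadj_candidates(old, new))
```

In this toy input the modified chunk at position 1 sits between duplicate chunks, so it is paired with old chunk 1 without computing any similarity features. Chunks that have no duplicate neighbour would need another detector, which is the role the abstract assigns to the improved super-feature approach.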
