IEEE International Conference on Cluster Computing

Exploring the Potential of Fast Delta Encoding: Marching to a Higher Compression Ratio

Abstract

Delta compression (also called delta encoding) is a data reduction technique that calculates the differences (i.e., the delta) among very similar files and chunks, and is thus widely used for optimizing synchronization replication, backup/archival storage, cache compression, etc. However, delta compression is costly because of the time-consuming word-matching operations needed for delta calculation. Existing delta encoding approaches either encode slowly, such as Xdelta and Zdelta, or achieve a low compression ratio, such as Ddelta and Edelta. In this paper, we propose Gdelta, a fast delta encoding approach with a high compression ratio. Gdelta improves the delta encoding speed by employing an improved fast Gear-based rolling hash for scanning fine-grained words and a quick array-based indexing scheme for word-matching; then, after word-matching, it batch-compresses the rest to improve the compression ratio. Our evaluation results, driven by six real-world datasets, suggest that Gdelta achieves encoding/decoding speedups of 2X to 4X over the classic Xdelta and Zdelta approaches while increasing the compression ratio by about 10% to 120%.
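
The three ingredients named in the abstract (a Gear-style rolling hash over fine-grained words, an array-based index for word-matching, and batch compression of what is left) can be illustrated with a minimal Python sketch. This is not the Gdelta implementation: the word size, index size, 8-bit shift, the copy/insert instruction format, and the use of zlib as the batch compressor are all assumptions chosen for illustration, and the helper names (`word_slots`, `build_index`, `encode`, `decode`) are hypothetical.

```python
import random
import zlib

WORD = 8             # word length in bytes (assumed, not from the paper)
INDEX_BITS = 16      # index array has 2**16 slots (assumed)
SHIFT = 64 // WORD   # a byte leaves the 64-bit hash after WORD updates
MASK64 = (1 << 64) - 1

# Gear table: one fixed pseudo-random 64-bit value per byte value.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]


def word_slots(data):
    """Yield (position, slot) for every WORD-byte window of data."""
    h = 0
    for i, b in enumerate(data):
        # Gear-style update: shift, then add the table entry for the byte.
        h = ((h << SHIFT) + GEAR[b]) & MASK64
        if i >= WORD - 1:
            # The top bits depend on all WORD bytes of the window.
            yield i - WORD + 1, h >> (64 - INDEX_BITS)


def build_index(base):
    """Array index: slot -> first base offset whose word hashes there."""
    index = [-1] * (1 << INDEX_BITS)
    for pos, slot in word_slots(base):
        if index[slot] < 0:
            index[slot] = pos
    return index


def encode(base, target):
    """Return (ops, packed): copy/insert ops plus zlib-compressed literals."""
    index = build_index(base)
    slots = [slot for _, slot in word_slots(target)]
    ops, literals, lit_run, i = [], bytearray(), 0, 0
    while i < len(target):
        cand = index[slots[i]] if i < len(slots) else -1
        n = 0
        if cand >= 0:
            # Verify byte by byte; this also rejects hash collisions.
            while (i + n < len(target) and cand + n < len(base)
                   and target[i + n] == base[cand + n]):
                n += 1
        if n >= WORD:
            if lit_run:
                ops.append(("insert", lit_run))
                lit_run = 0
            ops.append(("copy", cand, n))
            i += n
        else:
            literals.append(target[i])
            lit_run += 1
            i += 1
    if lit_run:
        ops.append(("insert", lit_run))
    # Batch step: compress all unmatched bytes together at the end.
    return ops, zlib.compress(bytes(literals))


def decode(base, ops, packed):
    """Rebuild the target from the base, the ops, and the packed literals."""
    literals = zlib.decompress(packed)
    out, lit_pos = bytearray(), 0
    for op in ops:
        if op[0] == "copy":
            _, off, n = op
            out += base[off:off + n]
        else:
            n = op[1]
            out += literals[lit_pos:lit_pos + n]
            lit_pos += n
    return bytes(out)
```

Calling `encode(base, target)` on two similar byte strings yields a list of copy and insert instructions together with one compressed blob of unmatched bytes, and `decode(base, ops, packed)` reverses it. A flat array indexed by hash bits avoids the pointer chasing of a chained hash table, at the cost of keeping only one candidate offset per slot.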
