Journal: Connection Science

A feature-based intelligent deduplication compression system with extreme resemblance detection



Abstract

With the fast development of various computing paradigms, the amount of data is increasing rapidly, which brings huge storage overhead. However, existing data deduplication techniques do not make full use of similarity detection to improve storage efficiency and data transmission rate. In this paper, we study the problem of utilising duplicate and resemblance detection techniques to further compress data. We first present the framework of FIDCS-ERD, a feature-based intelligent deduplication compression system with extreme resemblance detection, and introduce its main components and detailed workflow. We then propose a content-defined chunking algorithm for duplicate detection and a Bloom filter-based resemblance detection algorithm. FIDCS-ERD implements intelligent file chunking together with fast duplicate and resemblance detection. Through extensive experiments on real datasets, we demonstrate that FIDCS-ERD achieves better compression and more accurate resemblance detection than existing approaches.
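
The abstract names two building blocks, content-defined chunking and Bloom filter-based resemblance detection, without giving their details. The following is only a generic illustration of these two ideas, not the FIDCS-ERD algorithms: the gear-style rolling hash, the bit-overlap score, and every name and parameter (`cdc_chunks`, `BloomFilter`, `resemblance`, `sketch`, `mask`, `min_size`, `max_size`, `num_bits`, `num_hashes`) are assumptions chosen for the sketch.

```python
import hashlib
import random

# Assumption: a fixed pseudo-random table mapping each byte value to a
# 32-bit number, as used by gear-style rolling hashes.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def cdc_chunks(data: bytes, mask: int = 0x3F, min_size: int = 16, max_size: int = 256):
    """Cut a chunk where the rolling hash has all masked bits equal to zero.

    Boundaries depend on content rather than on byte offsets, so an
    insertion only disturbs the chunks near the edit.
    """
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class BloomFilter:
    """Plain Bloom filter stored as one Python integer used as a bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits, self.num_hashes, self.bits = num_bits, num_hashes, 0

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting SHA-256 with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item: bytes) -> bool:
        return all((self.bits >> p) & 1 for p in self._positions(item))


def resemblance(a: BloomFilter, b: BloomFilter) -> float:
    """Jaccard-style overlap of set bits: a rough stand-in for a similarity score."""
    union = bin(a.bits | b.bits).count("1")
    return bin(a.bits & b.bits).count("1") / union if union else 1.0


def sketch(data: bytes) -> BloomFilter:
    """Summarise data as a Bloom filter over its chunk fingerprints."""
    bf = BloomFilter()
    for chunk in cdc_chunks(data):
        bf.add(hashlib.sha256(chunk).digest())
    return bf
```

In this sketch, two inputs that share most of their chunks set mostly the same bits and score close to 1, while unrelated inputs overlap only by chance. Exact duplicates can be found separately by comparing chunk fingerprints directly.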
