
DATA DE-DUPLICATION WITH ADAPTIVE CHUNKING AND ACCELERATED MODIFICATION IDENTIFYING


Abstract

A data de-duplication system pursues not only a high de-duplication rate, which refers to the aggregate reduction in storage requirements gained from de-duplication, but also a high de-duplication speed. To solve the problem of random parameter setting in Content Defined Chunking (CDC), a self-adaptive data chunking algorithm is proposed. The algorithm improves the de-duplication rate by running a pre-processing de-duplication pass over samples of the classified files and then selecting appropriate algorithm parameters. Meanwhile, FastCDC, a content-based fast data chunking algorithm, is adopted to address the low de-duplication speed of CDC. FastCDC introduces a de-duplication factor and an acceleration factor; by adjusting these two parameters, it can significantly boost de-duplication speed while largely preserving the de-duplication rate. The experimental results demonstrate that the proposed method improves the de-duplication rate by about 5%, while FastCDC increases de-duplication speed by 50% to 200% at the expense of less than 3% loss in de-duplication rate.
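
To make the chunking idea concrete, the following is a minimal Python sketch of generic content-defined chunking with chunk-level de-duplication. It is not the paper's self-adaptive algorithm or its FastCDC variant, whose de-duplication and acceleration factors are not specified in this abstract. The Gear-style rolling hash, the table `GEAR`, the functions `chunk`, `dedup_rate`, and `pick_mask_bits`, all parameter names and defaults, and the definition of de-duplication rate as the fraction of bytes saved are assumptions made for illustration only.

```python
import hashlib
import random

# Hypothetical table of 256 pseudo-random 32-bit values for a Gear-style rolling hash.
_table_rng = random.Random(0)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]


def chunk(data: bytes, mask_bits: int = 10, min_size: int = 256, max_size: int = 8192):
    """Split data at content-defined boundaries.

    A boundary is declared when the low `mask_bits` bits of the rolling hash
    are all zero, so a larger `mask_bits` gives larger chunks on average.
    """
    mask = (1 << mask_bits) - 1
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF  # rolling hash update
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def dedup_rate(data: bytes, **chunk_params) -> float:
    """Fraction of bytes saved by storing each distinct chunk only once (assumed definition)."""
    if not data:
        return 0.0
    seen, stored = set(), 0
    for piece in chunk(data, **chunk_params):
        fingerprint = hashlib.sha256(piece).digest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            stored += len(piece)
    return 1 - stored / len(data)


def pick_mask_bits(sample: bytes, candidates=(8, 10, 12)) -> int:
    """Choose the candidate that de-duplicates a sample best (illustrative parameter selection)."""
    return max(candidates, key=lambda bits: dedup_rate(sample, mask_bits=bits))


if __name__ == "__main__":
    rng = random.Random(1)
    block = bytes(rng.getrandbits(8) for _ in range(20000))
    sample = block + b"X" + block  # second copy is shifted by one inserted byte
    print("de-duplication rate:", dedup_rate(sample))
    print("selected mask_bits:", pick_mask_bits(sample))
```

Because boundaries depend on content rather than on fixed offsets, the chunker can realign after the inserted byte and detect the repeated data. The selector here scores only bytes saved on the sample; a practical system would presumably also weigh index size and chunking speed, since smaller chunks tend to find more duplicates at a higher metadata cost.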

Bibliographic record

  • Source
    Computing and Informatics | 2016, Issue 3 | pp. 586-614 | 29 pages
  • Author affiliations

    Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Peoples R China;

    Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Peoples R China;

    Inspur Beijing Elect Informat Ind Co Ltd, Beijing 100085, Peoples R China;

    Linkoping Univ, Dept Sci & Technol, Campus Norrkoping, SE-60174 Linkoping, Sweden;

    Xi An Jiao Tong Univ, Dept Comp Sci & Technol, Xian 710049, Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    Data de-duplication; self-adaptive; FastCDC;
