Techniques for de-duplicating data storage systems using a segmented index

Abstract

Techniques are provided for storing data on a data storage system in a de-duplicated manner while allowing real-time reference to an index that is too large to fit within memory. This may be accomplished by dividing the index into smaller segments stored on disk, only a subset of which is loaded into memory at a given time. A predictive filter for each segment is kept in memory, allowing a de-duplication driver to quickly predict whether any given new block is likely to be indexed by that segment. Because identical blocks are often stored in long identical sequences (e.g., when a disk image is copied to a disk for a virtual machine), a segment stored on disk that is referenced many times within a short period is loaded into memory so that the remainder of the long sequence can be de-duplicated.
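
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract only speaks of a "predictive filter", so the Bloom filter here is an assumption, as are the SHA-256 block fingerprints, the LRU eviction policy, and every name and parameter (`SegmentedIndex`, `hot_threshold`, `max_loaded`, and so on). The "short period" is also simplified to a plain hit counter that resets when a segment is loaded, and the disk is simulated with a dictionary.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Small Bloom filter standing in for the per-segment predictive filter (assumption)."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SegmentedIndex:
    """Hypothetical de-duplication index split into segments kept on a simulated disk."""

    def __init__(self, segment_capacity=4, max_loaded=1, hot_threshold=2):
        self.segment_capacity = segment_capacity
        self.max_loaded = max_loaded
        self.hot_threshold = hot_threshold
        self.disk = {}               # segment id -> {digest: address}; simulated disk
        self.filters = {}            # segment id -> BloomFilter, always in memory
        self.hits = {}               # segment id -> filter hits since last load
        self.loaded = OrderedDict()  # LRU cache of segments read back from disk
        self.open_segment = {}       # segment currently being filled, in memory
        self.open_filter = BloomFilter()
        self.next_address = 0
        self.disk_reads = 0

    def _flush_open_segment(self):
        seg_id = len(self.disk)
        self.disk[seg_id] = self.open_segment
        self.filters[seg_id] = self.open_filter
        self.hits[seg_id] = 0
        self.open_segment, self.open_filter = {}, BloomFilter()

    def _load(self, seg_id):
        if len(self.loaded) >= self.max_loaded:
            self.loaded.popitem(last=False)      # evict the least recently used segment
        self.loaded[seg_id] = self.disk[seg_id]  # simulated disk read
        self.disk_reads += 1
        self.hits[seg_id] = 0

    def _lookup(self, digest):
        if digest in self.open_segment:
            return self.open_segment[digest]
        for seg_id in list(self.loaded):
            if digest in self.loaded[seg_id]:
                self.loaded.move_to_end(seg_id)
                return self.loaded[seg_id][digest]
        # Consult only the in-memory filters of segments that are still on disk.
        for seg_id, bloom in self.filters.items():
            if seg_id in self.loaded or not bloom.might_contain(digest):
                continue
            self.hits[seg_id] += 1
            if self.hits[seg_id] >= self.hot_threshold:
                self._load(seg_id)  # segment is "hot": bring it into memory
                if digest in self.loaded[seg_id]:
                    return self.loaded[seg_id][digest]
        return None

    def store(self, block):
        """Return (address, deduplicated) for a block of bytes."""
        digest = hashlib.sha256(block).digest()
        address = self._lookup(digest)
        if address is not None:
            return address, True
        address = self.next_address
        self.next_address += 1
        self.open_segment[digest] = address
        self.open_filter.add(digest)
        if len(self.open_segment) >= self.segment_capacity:
            self._flush_open_segment()
        return address, False


# Write an 8-block "disk image" twice, as when copying the same image again.
image = [f"block-{i}".encode() for i in range(8)]
index = SegmentedIndex(segment_capacity=4, max_loaded=1, hot_threshold=2)
first_pass = [index.store(b)[1] for b in image]
second_pass = [index.store(b)[1] for b in image]
print(second_pass, index.disk_reads)
```

In this sketch the first block of each repeated run is stored again rather than de-duplicated, because its segment has not yet been referenced often enough to be loaded. That is the trade-off the abstract implies: a few early duplicates are accepted in exchange for not reading a segment from disk on every filter hit.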

Bibliographic data

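  • Publication number: US10614036B1

    Patent type:

  • Publication date: 2020-04-07

    Original format: PDF

  • Applicant/Assignee: EMC IP HOLDING COMPANY LLC

    Application number: US201615394376

  • Filing date: 2016-12-29

  • Classification: G06F16; G06F16/174; G06F11/14; G06F16/13; G06F16/14

  • Country: US

  • Added to database: 2022-08-21 11:26:37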
