DISTRIBUTED DEDUPLICATION USING LOCALITY SENSITIVE HASHING
Abstract
Deduplication in a distributed storage system is described. A deduplication manager identifies a data item that includes multiple data chunks. The deduplication manager defines a first extent on a first node in the distributed storage system. The deduplication manager compares the first extent to existing groups of similar extents to find one of the existing groups whose extents are similar to the first extent. The deduplication manager selects, from the found group, a second extent that closely matches the first extent. The deduplication manager removes from the first extent one or more data chunks that are included in both the first extent and the second extent. The deduplication manager associates, with the first extent, a pointer to the second extent for the removed one or more data chunks.
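The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say which locality sensitive hash is used, so the sketch assumes MinHash over chunk fingerprints, with extents that share any signature component treated as one group of similar extents. The names `Extent`, `DedupManager`, `minhash_signature`, and `add_extent` are hypothetical, and everything is held in memory rather than spread across real storage nodes.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


def fingerprint(chunk: bytes) -> str:
    """Content hash that identifies a data chunk."""
    return hashlib.sha256(chunk).hexdigest()


@dataclass
class Extent:
    extent_id: str
    node: str
    chunks: dict = field(default_factory=dict)    # fingerprint -> bytes stored locally
    pointers: dict = field(default_factory=dict)  # fingerprint -> id of extent holding it

    @classmethod
    def from_chunks(cls, extent_id, node, chunks):
        return cls(extent_id, node, {fingerprint(c): c for c in chunks})

    def fingerprints(self):
        """All chunks the extent logically contains, local or referenced."""
        return set(self.chunks) | set(self.pointers)


def minhash_signature(fingerprints, num_hashes=8):
    """Assumed LSH scheme: one minimum per seeded hash function."""
    if not fingerprints:
        return ()
    return tuple(
        min(hashlib.sha256(f"{seed}:{fp}".encode()).hexdigest() for fp in fingerprints)
        for seed in range(num_hashes)
    )


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class DedupManager:
    """Hypothetical in-memory stand-in for the deduplication manager."""

    def __init__(self, num_hashes=8):
        self.num_hashes = num_hashes
        self.extents = {}
        # (signature position, minhash value) -> ids of extents in that group
        self.groups = defaultdict(set)

    def add_extent(self, first):
        """Deduplicate `first` against its closest match; return that extent's id or None."""
        fps = first.fingerprints()
        keys = list(enumerate(minhash_signature(fps, self.num_hashes)))

        # Find existing groups whose extents are similar to the first extent.
        candidates = set()
        for key in keys:
            candidates |= self.groups.get(key, set())

        # Select the second extent: the candidate with the highest Jaccard similarity.
        best = max(
            sorted(candidates),
            key=lambda eid: jaccard(fps, self.extents[eid].fingerprints()),
            default=None,
        )

        if best is not None:
            second = self.extents[best]
            # Remove chunks present in both extents and point to the second extent instead.
            for fp in set(first.chunks) & set(second.chunks):
                del first.chunks[fp]
                first.pointers[fp] = second.extent_id

        # Index the first extent so that later extents can match against it.
        self.extents[first.extent_id] = first
        for key in keys:
            self.groups[key].add(first.extent_id)
        return best
```

In use, a caller would build each extent with `Extent.from_chunks(...)` and pass it to `add_extent`. An extent with no similar predecessor keeps all of its chunks, while a later extent that lands in the same group keeps only the chunks the selected second extent does not already store locally. Because MinHash is probabilistic, two extents with only partial overlap are likely, but not guaranteed, to share a group.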