Conference: International Conference on Computing, Networking and Communications

DS-Dedupe: A scalable, low network overhead data routing algorithm for inline cluster deduplication system


Abstract

Inline cluster deduplication is widely used in data centers to improve storage efficiency. In a cluster deduplication system, the data routing algorithm has a crucial impact on the deduplication factor, throughput, and scalability. In this paper, we propose a stateful data routing algorithm called DS-Dedupe. To make full use of the similarity in data streams, DS-Dedupe builds a similarity index at super-chunk granularity in each client to track the super-chunks that have already been routed. From this index we calculate a similarity coefficient that determines whether a new super-chunk is assigned directly or by consistent hashing, thus striking a sensible tradeoff between deduplication factor and network overhead. Our experiments on two datasets demonstrate that DS-Dedupe achieves a high elimination ratio at low communication overhead. In addition, because data routing is performed by the client nodes, a metadata server bottleneck can be avoided.
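The routing decision described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not define the similarity coefficient, the index layout, or the threshold. Everything named here is an assumption made for illustration, including `SimilarityRouter`, `ConsistentHashRing`, the handprint of the smallest chunk fingerprints, the coefficient taken as the fraction of handprint entries already in the client-side index, and the 0.5 threshold.

```python
import hashlib
from bisect import bisect_right


def fingerprint(chunk: bytes) -> str:
    """Chunk fingerprint; SHA-1 is an assumption, not taken from the paper."""
    return hashlib.sha1(chunk).hexdigest()


class ConsistentHashRing:
    """Plain consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (int(hashlib.md5(f"{n}#{i}".encode()).hexdigest(), 16), n)
            for n in nodes
            for i in range(vnodes)
        )
        self._keys = [k for k, _ in self._ring]

    def lookup(self, key: str) -> str:
        h = int(hashlib.md5(key.encode()).hexdigest(), 16)
        # First ring position clockwise from h, wrapping around at the end.
        idx = bisect_right(self._keys, h) % len(self._ring)
        return self._ring[idx][1]


class SimilarityRouter:
    """Client-side stateful router (hypothetical names and parameters)."""

    def __init__(self, nodes, threshold=0.5, handprint_size=4):
        self.ring = ConsistentHashRing(nodes)
        self.threshold = threshold            # assumed cut-off
        self.handprint_size = handprint_size  # assumed sample size
        self.index = {}                       # representative fingerprint -> node

    def route(self, super_chunk):
        """Return (node, how), where how is "direct" or "hash"."""
        if not super_chunk:
            raise ValueError("super-chunk must contain at least one chunk")
        # Handprint: the k smallest distinct fingerprints of the super-chunk.
        fps = sorted({fingerprint(c) for c in super_chunk})
        handprint = fps[: self.handprint_size]

        # Count how many handprint entries were previously routed to each node.
        votes = {}
        for fp in handprint:
            node = self.index.get(fp)
            if node is not None:
                votes[node] = votes.get(node, 0) + 1

        node, how = None, None
        if votes:
            best, hits = max(votes.items(), key=lambda kv: kv[1])
            # Assumed similarity coefficient: share of the handprint seen before.
            if hits / len(handprint) >= self.threshold:
                node, how = best, "direct"
        if node is None:
            # Not similar enough: fall back to stateless consistent hashing.
            node, how = self.ring.lookup(handprint[0]), "hash"

        # Record where this super-chunk went so later ones can follow it.
        for fp in handprint:
            self.index[fp] = node
        return node, how
```

In this sketch, routing the same super-chunk twice sends it through the hash ring the first time and directly to the same node the second time, because its handprint is then fully present in the local index. The index lives on the client, so no metadata server lookup is needed for the decision.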
