
DDFLASKS: Deduplicated Very Large Scale Data Store

Abstract

With the increasing number of connected devices, it becomes essential to find novel data management solutions that can leverage their computational and storage capabilities. However, developing very large scale data management systems requires tackling a number of distributed systems challenges, namely continuous failures and high levels of node churn. In this context, epidemic-based protocols have proved suitable and effective, and have been successfully used to build DataFlasks, an epidemic data store for massive-scale systems. Ensuring resiliency in this data store comes at a significant cost in storage resources and network bandwidth. Deduplication has proven to be an efficient technique for reducing both costs, but applying it to a large-scale distributed storage system is not a trivial task. In fact, achieving significant space savings without compromising the resiliency and decentralized design of these storage systems is a relevant research challenge. In this paper, we extend DataFlasks with deduplication to design DDFlasks. The system is evaluated in a real-world scenario using Wikipedia snapshots, and the results are twofold: deduplication decreases storage consumption by up to 63% and network bandwidth consumption by up to 20%, while maintaining a fully decentralized and resilient design.
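
To make the idea of deduplication concrete, the following is a minimal single-node sketch in Python. It is not the DDFlasks design, which the abstract does not detail: the fixed-size chunking, the SHA-256 content addressing, and the names `DedupStore`, `put`, `get`, and `savings` are all assumptions chosen for illustration. It only shows how storing each distinct chunk once can reduce space when successive snapshots share most of their content.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct chunk.

    Hypothetical illustration only; not the DDFlasks implementation.
    """

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # SHA-256 hex digest -> chunk bytes (stored once)
        self.objects = {}  # object key -> ordered list of chunk digests

    def put(self, key, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.objects[key] = digests

    def get(self, key):
        """Rebuild an object from its chunk digests."""
        return b"".join(self.chunks[d] for d in self.objects[key])

    def logical_bytes(self):
        """Total size of all objects as seen by clients."""
        return sum(len(self.chunks[d])
                   for digests in self.objects.values() for d in digests)

    def physical_bytes(self):
        """Bytes actually held after deduplication."""
        return sum(len(c) for c in self.chunks.values())

    def savings(self):
        """Fraction of logical bytes that deduplication avoided storing."""
        logical = self.logical_bytes()
        return 0.0 if logical == 0 else 1 - self.physical_bytes() / logical


# Two "snapshots" that differ only in their last chunk.
store = DedupStore(chunk_size=4)
store.put("v1", b"aaaabbbbcccc")
store.put("v2", b"aaaabbbbdddd")
print(f"{store.savings():.0%}")  # prints "33%": 16 of 24 bytes are stored
```

In a decentralized setting such as the one the abstract describes, the hard part is doing this without a central chunk index and without weakening replication; the sketch above deliberately ignores both concerns.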