Journal: Journal of Supercomputing

Towards a delivery scheme for speedup of data backup in distributed storage systems using erasure codes



Abstract

Distributed storage systems built on peer-to-peer networks can provide large-scale data storage and high data reliability through redundancy. Data backup is the process of storing data in a set of redundant storage nodes, and completing it rapidly is critical to maintaining system performance. Traditional data backup in distributed systems based on erasure codes uses a star-structured scheme, in which the source node sends each redundant block directly to its target storage node. Because bandwidths are heterogeneous, storage throughput and delay are limited by the bottleneck bandwidth. The recent in-network redundancy generation scheme uses the locally repairable property of self-repairing codes to speed up data backup. However, such codes do not have the maximum distance separable (MDS) property and thus do not achieve optimal storage efficiency. A fast backup scheme for distributed systems based on general erasure coding is still lacking. To this end, we propose that, instead of focusing only on the bandwidths between the source node and the target nodes, the bandwidths between target storage nodes should be fully taken into account. In our scheme, each redundant data block is divided into parts of different proportions, and each part is sent to its target storage node via other, different storage nodes. The benefit is that spare bandwidth between target storage nodes is used to reduce backup time. We further show how this process can be modeled and derive a formula for the final backup time. The minimum backup time can then be obtained by solving a classical quadratic programming problem. We conduct both numerical analysis and an experimental study. Our experiments show that the delay is reduced by 59% compared with the common star-structured scheme, and that throughput during the backup process increases significantly.
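The contrast between the star-structured scheme and splitting a block across relay paths can be illustrated with a toy model. The sketch below is not the paper's model or its quadratic program, since the abstract gives neither. It assumes only two target nodes, one with a fast link from the source and one with a slow link, pipelined forwarding at the relay, and a simple grid search in place of the optimization step. All function and parameter names are hypothetical.

```python
def star_backup_time(block_size, source_bandwidths):
    """Star scheme: each block goes directly from the source to its target,
    so the backup finishes when the slowest (bottleneck) link finishes."""
    return max(block_size / b for b in source_bandwidths)


def relay_backup_time(block_size, b_fast, b_slow, b_relay, p):
    """Toy two-target model (an assumption, not the paper's formula).

    Target A has the fast source link, target B the slow one.
    A fraction p of B's block travels directly from the source;
    the remaining (1 - p) travels source -> A -> B.
    """
    # Direct share of B's block over the slow source -> B link.
    direct = p * block_size / b_slow
    # The source -> A link carries A's own block plus the relayed share for B.
    via_source_link = (1 + (1 - p)) * block_size / b_fast
    # The A -> B link carries only the relayed share (forwarding assumed pipelined).
    via_relay_link = (1 - p) * block_size / b_relay
    return max(direct, via_source_link, via_relay_link)


def best_split(block_size, b_fast, b_slow, b_relay, steps=1000):
    """Grid search over p in [0, 1]; a stand-in for the optimization step."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda p: relay_backup_time(block_size, b_fast, b_slow, b_relay, p),
    )


if __name__ == "__main__":
    size, fast, slow, relay = 100.0, 10.0, 2.0, 10.0  # illustrative values only
    p = best_split(size, fast, slow, relay)
    print("star time: ", star_backup_time(size, [fast, slow]))
    print("best split:", p)
    print("relay time:", relay_backup_time(size, fast, slow, relay, p))
```

With p = 1 the relay model reduces to the star scheme, so any improvement comes entirely from routing part of the slow target's block over the otherwise idle link between the two targets.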
