Conference: International Conference on the Design of Reliable Communication Networks

On improving recovery performance in erasure code based geo-diverse storage clusters


Abstract

Storage providers increasingly use erasure-code-based distributed storage systems for big data, since these systems offer the same reliability as replication while requiring significantly less storage. However, when the data nodes of a storage system are spread across a very large geographical area, the code's recovery performance is affected by various network-related and computation-related factors. In this paper, we present an XOR-based code supplemented with the ideas of parity duplication and rack awareness, which could be adopted in such storage clusters to improve recovery performance during node failures. We have implemented these ideas in the erasure code module of the XORBAS version of the Hadoop Distributed File System (HDFS). To evaluate them, we employ a geo-diverse cluster on the NeCTAR research cloud. The experimental results show that the techniques reduce the data read for repair by 85% and the repair duration by 57% during node failures, at the cost of a 21% increase in storage requirement compared with the traditional Reed-Solomon codes used in HDFS. Taken together, these ideas could offer a better solution for a code-based storage system that spans a wide geographical area, has storage constraints that make triple replication unaffordable, and at the same time has strict requirements on minimal recovery time.
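
The abstract does not give the code construction, so the following is only a minimal illustrative sketch of the general idea of XOR-based local repair with a duplicated parity block. It is not the authors' implementation and does not use any HDFS or XORBAS API. The block layout, the group size of two, the rack names, and the helpers `xor_blocks`, `pick_parity_copy`, and `repair` are all hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical layout: four data blocks split into two local groups.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
groups = [[0, 1], [2, 3]]

# One XOR parity per group, stored as two copies on different racks
# (the rack names are made up for illustration).
parity_copies = []
for group in groups:
    parity = xor_blocks([data[i] for i in group])
    parity_copies.append([("rack-east", parity), ("rack-west", parity)])

def pick_parity_copy(copies, preferred_rack):
    """Prefer a parity copy on the repairing node's rack, else take the first."""
    for rack, block in copies:
        if rack == preferred_rack:
            return rack, block
    return copies[0]

def repair(lost, preferred_rack):
    """Rebuild data[lost] from its group's survivors plus one parity copy."""
    gi = next(k for k, g in enumerate(groups) if lost in g)
    survivors = [data[i] for i in groups[gi] if i != lost]
    rack, parity = pick_parity_copy(parity_copies[gi], preferred_rack)
    blocks_read = survivors + [parity]
    return xor_blocks(blocks_read), len(blocks_read), rack

rebuilt, blocks_read, parity_rack = repair(lost=1, preferred_rack="rack-west")
```

In this toy layout a single lost block is rebuilt from its own group only, and the duplicated parity lets the repair read from whichever rack is closer. Both effects are in the spirit of what the abstract describes, though the actual code parameters in the paper may differ.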
