Improving Data Availability in HDFS through Replica Balancing

Abstract

Over time, the data distribution across an HDFS cluster may become unbalanced. The HDFS Balancer is a tool provided by Apache Hadoop that redistributes blocks by moving them from nodes with higher utilization to nodes with lower utilization. However, when rearranging blocks, the HDFS Balancer does not aim to increase the availability of the data. This work presents a strategy that gives priority to block movements that increase the overall availability of the data stored in HDFS. The strategy also increases fault tolerance, since placing blocks across a larger number of racks tends to reduce the chance of data loss. To evaluate the implementation, an experimental investigation was conducted to measure system performance after balancing the cluster with the proposed solution.
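
The idea of preferring block movements that spread replicas over more racks can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use any Hadoop API: the `Move` type, the `rack_gain` and `prioritize` functions, and the node and rack names are all hypothetical, and the scoring rule (change in the number of distinct racks holding a block) is only one assumed way to read "increase availability".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """A candidate balancing move (hypothetical type, for illustration only)."""
    block: str
    source: str  # over-utilized node that currently holds a replica
    target: str  # under-utilized node that would receive the replica


def rack_gain(move, replicas, rack_of):
    """Change in the number of distinct racks holding the block if the move is applied."""
    nodes_before = replicas[move.block]
    nodes_after = (nodes_before - {move.source}) | {move.target}
    racks_before = {rack_of[n] for n in nodes_before}
    racks_after = {rack_of[n] for n in nodes_after}
    return len(racks_after) - len(racks_before)


def prioritize(moves, replicas, rack_of):
    """Order candidate moves so those that add rack diversity come first.

    sorted() is stable, so moves with the same gain keep their original order.
    """
    return sorted(moves, key=lambda m: rack_gain(m, replicas, rack_of), reverse=True)


if __name__ == "__main__":
    # Assumed toy topology: five nodes spread over three racks.
    rack_of = {"n1": "r1", "n2": "r1", "n3": "r2", "n4": "r3", "n5": "r1"}
    # Block b1 has three replicas, two of them in the same rack.
    replicas = {"b1": {"n1", "n2", "n3"}}
    candidates = [
        Move("b1", "n1", "n5"),  # stays within rack r1
        Move("b1", "n1", "n4"),  # adds rack r3
        Move("b1", "n3", "n5"),  # collapses all replicas into rack r1
    ]
    for move in prioritize(candidates, replicas, rack_of):
        print(move, rack_gain(move, replicas, rack_of))
```

In this toy example the move to `n4` is ranked first because it places the block in a third rack, while the move that would leave every replica in rack `r1` is ranked last. A real balancer would also have to weigh node utilization, which this sketch ignores.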

Bibliographic record

Source: Latin-American Symposium on Dependable Computing, 2019, pp. 1-6 (6 pages)
Authors: Rhauani Weber Aita Fazul; Paulo Vinicius Cardoso; Patrícia Pitthan Barcelos
Original format: PDF
Keywords: data handling; distributed databases; fault tolerance; parallel processing; storage management

