The HDFS distributed file system uses a rack-aware replica placement strategy, which ensures data reliability to a certain extent. After the system has run for some time, however, the data distribution becomes unbalanced. Although the Balancer program can redistribute the data, it handles the storage imbalance only after the fact, and this delay affects the data read rate and the reliability of the system. This paper adopts a replica placement strategy based on multi-layer consistent hashing. First, a consistent hashing algorithm determines the rack that corresponds to a data replica; then a second consistent hashing lookup determines the datanode within that rack, which becomes the final storage location. During the lookup, the consistent hashing algorithm divides the address space into equal-sized partitions and uses virtual nodes, which improves both lookup efficiency and the balance of the distribution. Compared with the original strategy, the proposed strategy greatly improves balanced data storage and upload rate. It is also adaptive with respect to the data.
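The two-level lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names `PartitionedRing` and `MultiLayerPlacement`, the MD5-based hash, the per-replica key format, and the default values for `vnodes` and `partitions` are all hypothetical choices made for the example. Each ring places virtual nodes on a 32-bit hash space and precomputes the owner of every equal-sized partition, so a lookup reduces to one hash and one array index.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 32-bit position on the ring (MD5 is an assumption)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:4], "big")


class PartitionedRing:
    """Consistent-hash ring with virtual nodes and equal-sized partitions."""

    def __init__(self, nodes, vnodes=64, partitions=1024):
        if not nodes:
            raise ValueError("ring needs at least one node")
        self.partitions = partitions
        # Place `vnodes` virtual nodes per physical node on the ring.
        points = sorted(
            (_hash(f"{node}#vn{i}"), node) for node in nodes for i in range(vnodes)
        )
        positions = [pos for pos, _ in points]
        # Precompute each partition's owner: the first virtual node found
        # clockwise from the partition's start address (wrapping around).
        self.owner = []
        for k in range(partitions):
            start = (k << 32) // partitions
            idx = bisect.bisect_left(positions, start) % len(points)
            self.owner.append(points[idx][1])

    def lookup(self, key: str) -> str:
        # The partition index follows directly from the hash value.
        return self.owner[(_hash(key) * self.partitions) >> 32]


class MultiLayerPlacement:
    """Layer 1 picks a rack; layer 2 picks a datanode inside that rack."""

    def __init__(self, topology, vnodes=64, partitions=1024):
        # topology: dict mapping rack name -> list of datanode names
        self.rack_ring = PartitionedRing(list(topology), vnodes, partitions)
        self.node_rings = {
            rack: PartitionedRing(nodes, vnodes, partitions)
            for rack, nodes in topology.items()
        }

    def place(self, block_id: str, replicas: int = 3):
        placements = []
        for r in range(replicas):
            # Hypothetical key scheme: one distinct key per replica.
            key = f"{block_id}#replica{r}"
            rack = self.rack_ring.lookup(key)
            node = self.node_rings[rack].lookup(key)
            placements.append((rack, node))
        return placements


if __name__ == "__main__":
    topology = {
        "rack1": ["dn1", "dn2"],
        "rack2": ["dn3", "dn4"],
        "rack3": ["dn5", "dn6"],
    }
    placement = MultiLayerPlacement(topology)
    print(placement.place("blk_0001"))
```

The sketch does not prevent two replicas of the same block from landing on the same datanode, and it does not model how rings are rebuilt when racks or datanodes join or leave; a real placement policy would need both.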