Joint International Advanced Engineering and Technology Research Conference

An optimization strategy of massive small files storage based on HDFS



Abstract

The Hadoop Distributed File System (HDFS) performs well as distributed storage for large files, but it has an inherent weakness when storing small files: a large number of small files produces excessive metadata, creating a NameNode memory bottleneck, and the frequent RPC communication required to access them incurs excessive time overhead. To address these problems, this paper presents a merging algorithm based on two factors: the distribution of files and the correlation between files. The algorithm not only reduces the number of HDFS blocks but also places correlated files close together. Experimental results show that the algorithm effectively improves the storage efficiency of HDFS for small files and helps to optimize small-file access.
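The merging idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: using the parent directory as a proxy for file correlation, the greedy packing order, and the default 128 MB block size are all assumptions; the paper additionally weighs file distribution.

```python
# Illustrative sketch of correlation-aware small-file merging for HDFS.
# Files assumed to be accessed together (here: same parent directory)
# are packed greedily into merged blocks no larger than one HDFS block,
# reducing the number of blocks and keeping correlated files adjacent.

from collections import defaultdict

HDFS_BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size, in bytes


def merge_plan(files):
    """files: list of (path, size_in_bytes) tuples.
    Returns a list of merged blocks; each block is a list of paths
    whose total size fits within one HDFS block."""
    # Group by a correlation proxy: the parent directory of each file.
    groups = defaultdict(list)
    for path, size in files:
        groups[path.rsplit("/", 1)[0]].append((path, size))

    blocks = []
    for key in sorted(groups):
        current, used = [], 0
        # Largest-first packing within each correlated group.
        for path, size in sorted(groups[key], key=lambda f: -f[1]):
            if used + size > HDFS_BLOCK_SIZE and current:
                blocks.append(current)  # flush the full block
                current, used = [], 0
            current.append(path)
            used += size
        if current:
            blocks.append(current)
    return blocks
```

A real implementation would then write each planned block as a single merged file (for example via Hadoop's SequenceFile container) and keep a small index mapping original paths to offsets, so that one NameNode entry covers many small files.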
