A novel approach for efficient accessing of small files in HDFS: TLB-MapFile


Abstract

The Hadoop Distributed File System (HDFS) was originally designed for streaming access to large files, so its access and storage efficiency is low for massive numbers of small files. This paper presents TLB-MapFile, an access optimization approach for small files in HDFS that is based on MapFile. TLB-MapFile merges massive numbers of small files into large files using the MapFile mechanism, which reduces NameNode memory consumption, and adds a fast table structure (TLB) in the DataNode, which improves the retrieval efficiency of small files. First, small files are merged into large files according to the MapFile mechanism and stored in HDFS. Second, the access frequency and the ordered queue of small files (per unit time) are obtained from the system audit logs in HDFS, and the mapping information between blocks and small files is stored in the TLB table, which is updated regularly. TLB-MapFile improves the access efficiency of small files through an a priori prefetching strategy based on the TLB table. Experimental results show that this method can effectively reduce NameNode memory consumption and improve the reading speed of small files.
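The two steps described in the abstract can be illustrated with a small in-memory sketch. This is not the paper's implementation and it does not use the Hadoop API: the function names, the `BLOCK_SIZE` value, the dictionary-based index, and the list standing in for the audit log are all assumptions made for illustration. It shows only the general idea of merging small files behind a sorted index and caching the mappings of the most frequently accessed files in a TLB-like table.

```python
from bisect import bisect_left
from collections import Counter

BLOCK_SIZE = 16  # hypothetical and tiny on purpose; real HDFS blocks are far larger


def merge_small_files(files):
    """Pack small files into one blob plus a sorted name -> (offset, length) index."""
    names = sorted(files)  # a MapFile-style index keeps its keys sorted
    entries, chunks, offset = [], [], 0
    for name in names:
        data = files[name]
        entries.append((offset, len(data)))
        chunks.append(data)
        offset += len(data)
    return b"".join(chunks), {"keys": names, "entries": entries}


def index_lookup(index, name):
    """Slow path: binary search over the sorted index keys."""
    i = bisect_left(index["keys"], name)
    if i < len(index["keys"]) and index["keys"][i] == name:
        return index["entries"][i]
    raise KeyError(name)


def build_tlb(access_log, index, capacity):
    """Cache mappings for the `capacity` most frequently accessed files.

    `access_log` is a plain list of file names standing in for parsed
    audit-log records (an assumption of this sketch).
    """
    tlb = {}
    for name, _count in Counter(access_log).most_common(capacity):
        offset, length = index_lookup(index, name)
        tlb[name] = (offset // BLOCK_SIZE, offset, length)  # (block id, offset, length)
    return tlb


def read_file(blob, index, tlb, name):
    """Return (data, hit); `hit` is True when the TLB answered the lookup."""
    if name in tlb:
        _block, offset, length = tlb[name]
        return blob[offset:offset + length], True
    offset, length = index_lookup(index, name)
    return blob[offset:offset + length], False
```

In this sketch, rebuilding the table with `build_tlb` from a fresh log window plays the role of the regular TLB update. Prefetching is left out; one possible extension, again an assumption, would be to load the other TLB entries that share a block id with the requested file.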
