Journal: Computer Science and Information Systems

An optimized method of HDFS for massive small files storage


Abstract

The development of the Internet of Things (IoT) and Cyber-Physical Systems (CPS) has greatly facilitated many aspects of technological applications and development. This may lead to significant data growth, especially in small files. The analysis and processing of large numbers of small files has become a crucial part of the development of IoT and CPS. The Hadoop Distributed File System (HDFS) has become a powerful platform for storing large volumes of data. However, HDFS has a number of issues when dealing with small files, such as substantial memory consumption and poor access efficiency. In this paper, a Dynamic Queue of Small Files (DQSF) algorithm is proposed to solve these problems. DQSF divides small files into categories using an analytic hierarchy process that examines the performance of small files with different ranges across four indexes, and it determines the size of the dynamic queue according to the best system performance. Additionally, period classification is applied to preprocess the small files before storage, and a prefetching mechanism based on a secondary index is used to process index tables. Experimental results show that this method can effectively reduce memory use and improve the storage efficiency of massive small files, thereby improving system performance.
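
The abstract does not give the details of DQSF, so the following is only a generic illustration of the underlying idea of queue-based small-file merging, not the authors' algorithm. It buffers small files in a bounded queue, writes them out as one merged blob when the queue would overflow, and keeps an index table mapping each file name to its location. The class name `MergeQueue`, the fixed `capacity`, and the in-memory blobs are all assumptions made for this sketch; in DQSF the queue size is chosen dynamically, and a real system would write the merged blobs to HDFS.

```python
from dataclasses import dataclass, field


@dataclass
class MergeQueue:
    """Hypothetical helper: buffers small files and merges them into blobs."""
    capacity: int                                  # assumed fixed queue size in bytes
    pending: list = field(default_factory=list)    # (name, data) pairs awaiting merge
    pending_bytes: int = 0
    merged: list = field(default_factory=list)     # merged blobs (stand-in for HDFS files)
    index: dict = field(default_factory=dict)      # name -> (blob_id, offset, length)

    def add(self, name: str, data: bytes) -> None:
        # Flush first if this file would push the queue past its capacity.
        if self.pending and self.pending_bytes + len(data) > self.capacity:
            self.flush()
        self.pending.append((name, data))
        self.pending_bytes += len(data)

    def flush(self) -> None:
        # Concatenate all queued files into one blob and record their offsets.
        if not self.pending:
            return
        blob_id = len(self.merged)
        blob = bytearray()
        for name, data in self.pending:
            self.index[name] = (blob_id, len(blob), len(data))
            blob += data
        self.merged.append(bytes(blob))
        self.pending.clear()
        self.pending_bytes = 0

    def read(self, name: str) -> bytes:
        # Look up the index table, then slice the file out of its merged blob.
        blob_id, offset, length = self.index[name]
        return self.merged[blob_id][offset:offset + length]


queue = MergeQueue(capacity=10)
queue.add("a.log", b"12345")
queue.add("b.log", b"678")
queue.add("c.log", b"abcdef")   # would overflow, so a.log and b.log are merged first
queue.flush()                   # merge whatever is still queued
```

Merging in this way means the file system tracks one entry per merged blob rather than one per small file, which is the general reason such schemes reduce metadata memory use; the index table is what makes individual files retrievable afterwards.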
