An optimized method of HDFS for massive small files storage

Jing Weipeng; Tong Danyu; Chen GuangSheng; Zhao Chuanyu; Zhu LiangKuan

Abstract

The development of the Internet of Things (IoT) and Cyber-Physical Systems (CPS) has greatly facilitated many areas of technological application and development. This may lead to significant data growth, especially in the number of small files, and the analysis and processing of large numbers of small files has become a crucial part of IoT and CPS development. The Hadoop Distributed File System (HDFS) has become a powerful platform for storing large volumes of data. However, HDFS has a number of issues when dealing with small files, such as substantial memory consumption and poor access performance. In this paper, a Dynamic Queue of Small Files (DQSF) algorithm is proposed to solve these problems. DQSF sorts small files into categories using an analytic hierarchy process (AHP) that examines the performance of small files in different ranges across four indexes, and it sets the size of the dynamic queue according to the best system performance. Additionally, period classification is applied to preprocess the small files before storage, and a secondary-index prefetching mechanism is used to process the index tables. Experimental results show that this method can effectively reduce memory use and improve the storage efficiency of massive small files, thereby optimizing system performance.
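
The abstract gives no implementation details, so the following is only a minimal sketch of the general merge-and-index idea behind queue-based small-file storage, not the paper's DQSF algorithm. The class `MergeQueue`, its fixed `capacity`, and the in-memory `merged` list that stands in for HDFS files are all hypothetical. In the paper the queue size is derived from the AHP analysis rather than fixed by hand, and the period classification and index prefetching steps are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class MergeQueue:
    """Buffers small files and flushes them as one merged blob when full.

    Hypothetical illustration: each flushed blob stands in for a single
    large file on HDFS, so the NameNode would track one entry per blob
    instead of one entry per small file.
    """
    capacity: int                                   # queue size in bytes (assumed fixed here)
    buffer: bytearray = field(default_factory=bytearray)
    pending: dict = field(default_factory=dict)     # name -> (offset, length) in the buffer
    merged: list = field(default_factory=list)      # flushed blobs (stand-in for HDFS files)
    index: dict = field(default_factory=dict)       # name -> (blob_id, offset, length)

    def put(self, name: str, data: bytes) -> None:
        # Flush first if appending this file would overflow the queue.
        if self.buffer and len(self.buffer) + len(data) > self.capacity:
            self.flush()
        self.pending[name] = (len(self.buffer), len(data))
        self.buffer.extend(data)

    def flush(self) -> None:
        # Write the buffered files out as one blob and record where each one lives.
        if not self.buffer:
            return
        blob_id = len(self.merged)
        self.merged.append(bytes(self.buffer))
        for name, (offset, length) in self.pending.items():
            self.index[name] = (blob_id, offset, length)
        self.buffer = bytearray()
        self.pending = {}

    def get(self, name: str) -> bytes:
        # Look up the blob and byte range in the index, then slice the file out.
        blob_id, offset, length = self.index[name]
        return self.merged[blob_id][offset:offset + length]
```

Reading a file back requires only an index lookup and a slice of the merged blob, which is why the handling of the index tables matters for access performance in schemes of this kind.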

Bibliographic record

Source: Computer Science and Information Systems, 2018, Issue 3, 16 pages
Original format: PDF
Chinese Library Classification: Library science, librarianship
Keywords: WSN; HDFS; massive small files; Dynamic Queue; Analytic Hierarchy Process

