A SequenceFile-based approach is proposed to improve the storage efficiency of small files in cloud storage systems built on the Hadoop Distributed File System (HDFS). The approach applies multi-attribute decision theory to indices such as file reading time, file merging time, and the amount of memory saved in order to obtain an optimal scheme for merging small files, balancing the time consumed against the memory space saved. A system load forecasting algorithm based on the analytic hierarchy process is designed to predict the system load and thereby achieve load balancing. SequenceFile is then used to merge the small files. Experimental results show that the approach improves the storage efficiency of small files without degrading the performance of the storage system.