Journal of Parallel and Distributed Computing

Enhancing throughput of the Hadoop Distributed File System for interaction-intensive tasks

Abstract

The Hadoop Distributed File System (HDFS) is designed to run on commodity hardware and can be used as a stand-alone, general-purpose distributed file system (HDFS User Guide, 2008). It provides access to bulk data with high I/O throughput, which makes it suitable for applications with large I/O data sets. However, HDFS performance decreases dramatically when handling interaction-intensive files, i.e., files that are relatively small but frequently accessed. This paper analyzes the cause of the throughput degradation when accessing interaction-intensive files and presents an enhanced HDFS architecture, along with an associated storage allocation algorithm, that overcomes the problem. Experiments show that with the proposed architecture and storage allocation algorithm, HDFS throughput for interaction-intensive files increases by 300% on average, with only a negligible performance decrease for tasks on large data sets.
