Conference: International Conference on Distributed Computing Systems Workshops

Dynamic Random Access for Hadoop Distributed File System


Abstract

Hadoop Distributed File System (HDFS) is widely used to manage large-scale data because of its high scalability. HDFS natively supports sequential queries, which are the most common queries in applications. However, many applications still need to run random queries over large-scale data, so random queries are becoming increasingly important. Unfortunately, HDFS is not optimized for random reads, and random access to HDFS suffers from several disadvantages. In this paper, we present three methods that address these issues, optimizing random access to HDFS while preserving sequential access performance: 1) dynamically setting the size of the data packets used in transmission, 2) reusing TCP connections for localized random accesses, and 3) directing random accesses to the same server to make full use of the TCP connections. Experimental evaluations based on real-world data show that our methods are effective and that our solutions support both sequential and random access efficiently compared with the original methods.
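
The abstract gives no implementation details, so the following is only an illustrative sketch of the first two ideas, not the authors' code and not the HDFS API. It assumes a fixed 64 KiB packet size for sequential reads and a 512-byte lower bound; both constants, and the names `choose_packet_size` and `ConnectionCache`, are hypothetical. The connection factory is injected so the sketch runs without a network.

```python
from typing import Callable, Dict, Tuple

# Assumed values for illustration only; not taken from the paper.
DEFAULT_PACKET_BYTES = 64 * 1024  # fixed packet size used for sequential reads
MIN_PACKET_BYTES = 512            # hypothetical lower bound for tiny reads


def choose_packet_size(requested_bytes: int) -> int:
    """Pick a packet size no larger than the read needs (method 1, sketched)."""
    if requested_bytes <= 0:
        raise ValueError("requested_bytes must be positive")
    # Small random reads get small packets; large reads keep the default.
    return max(MIN_PACKET_BYTES, min(requested_bytes, DEFAULT_PACKET_BYTES))


class ConnectionCache:
    """Reuse one connection per (host, port) across reads (method 2, sketched)."""

    def __init__(self, connect: Callable[[str, int], object]) -> None:
        self._connect = connect  # hypothetical factory that opens a connection
        self._open: Dict[Tuple[str, int], object] = {}
        self.connects = 0  # how many new connections were actually opened

    def get(self, host: str, port: int) -> object:
        key = (host, port)
        if key not in self._open:
            # Only pay the connection setup cost on the first access.
            self._open[key] = self._connect(host, port)
            self.connects += 1
        return self._open[key]
```

Under these assumptions, a 4 KiB random read would be sent in a 4 KiB packet rather than a 64 KiB one, and repeated reads against the same server would share a single connection. Routing random accesses to the same server (method 3) would raise the hit rate of such a cache.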
