Conference: IEEE International Symposium on Parallel & Distributed Processing

BlobSeer: Bringing high throughput under heavy concurrency to Hadoop Map-Reduce applications


Abstract

Hadoop is a software framework supporting the Map-Reduce programming model. It relies on the Hadoop Distributed File System (HDFS) as its primary storage system. The efficiency of HDFS is crucial for the performance of Map-Reduce applications. We substitute the original HDFS layer of Hadoop with a new, concurrency-optimized data storage layer based on the BlobSeer data management service. Thereby, the efficiency of Hadoop is significantly improved for data-intensive Map-Reduce applications, which naturally exhibit a high degree of data access concurrency. Moreover, BlobSeer's features (built-in versioning, its support for concurrent append operations) open the possibility for Hadoop to further extend its functionalities. We report on extensive experiments conducted on the Grid'5000 testbed. The results illustrate the benefits of our approach over the original HDFS-based implementation of Hadoop.