
Energy savings and performance improvements with SSDs in the Hadoop Distributed File System



Abstract

Energy issues have attracted strong attention over the past decade, and that attention now extends to IT data processing infrastructures. These infrastructures must adapt existing platforms to deliver acceptable performance while reducing energy consumption. As the de facto platform for Big Data, Apache Hadoop has evolved significantly over the last years, with more than 60 releases bringing new features. By implementing the MapReduce programming paradigm and leveraging HDFS, its distributed file system, Hadoop has become a reliable and fault-tolerant middleware for parallel and distributed computing over large datasets. Nevertheless, Hadoop may struggle under certain workloads, resulting in poor performance and high energy consumption. Users increasingly demand that high-performance computing solutions address sustainability and limit energy consumption. In this thesis, we introduce HDFSH, a hybrid storage mechanism for HDFS that combines hard disks (HDs) and solid-state disks (SSDs) to achieve higher performance while saving power in Hadoop computations. HDFSH brings to the middleware the best of HDs (affordable cost per GB and high storage capacity) and SSDs (high throughput and low energy consumption) in a configurable fashion, using dedicated storage zones for each storage device type. We implemented our mechanism as a block placement policy for HDFS and assessed it over six recent releases of Hadoop with different architectural properties. Results indicate that our approach increases overall job performance while decreasing energy consumption under most of the hybrid configurations evaluated. Our results also show that, in many cases, storing only part of the data on SSDs yields significant energy savings and execution speedups.
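The abstract describes splitting data between an SSD zone and an HD zone in a configurable proportion. The following is a minimal sketch of that idea only, not the thesis's actual HDFSH policy, which is implemented inside HDFS. The function name `place_blocks`, the `ssd_percent` parameter, and the zone labels are all hypothetical and chosen for illustration.

```python
from collections import Counter


def place_blocks(num_blocks: int, ssd_percent: int) -> list[str]:
    """Assign each block to a hypothetical "SSD" or "HDD" zone.

    Illustrative assumption: roughly `ssd_percent` percent of the blocks go
    to the SSD zone, spread evenly over the block sequence. Integer
    arithmetic keeps the split exact and deterministic.
    """
    if not 0 <= ssd_percent <= 100:
        raise ValueError("ssd_percent must be between 0 and 100")

    zones = []
    for i in range(num_blocks):
        # A block goes to the SSD zone whenever the running SSD quota
        # (blocks seen so far * ssd_percent / 100) crosses a whole number.
        quota_before = (i * ssd_percent) // 100
        quota_after = ((i + 1) * ssd_percent) // 100
        zones.append("SSD" if quota_after > quota_before else "HDD")
    return zones


if __name__ == "__main__":
    layout = place_blocks(8, 25)
    print(layout)
    print(Counter(layout))
```

With `ssd_percent` set to 0 or 100 the sketch degenerates to an HD-only or SSD-only layout; intermediate values correspond to the partial-SSD case mentioned at the end of the abstract.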

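Bibliographic record

  • Author

    Ivanilton Polato

  • Author affiliation: not listed
  • Year: not listed
  • Total pages: not listed
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: not listed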
