Low-cost Data Analytics for Shared Storage and Network Infrastructures.

Abstract

Data analytics used to depend on specialized, high-end software and hardware platforms. Recent years, however, have brought forth the data-flow programming model, i.e., MapReduce, and with it a flurry of sturdy, scalable open-source software solutions for analyzing data. In essence, the commoditization of software frameworks for data analytics is well underway.

Yet, up to this point, data analytics frameworks are still regarded as standalone, dedicated components; deploying these frameworks requires companies to purchase hardware to meet storage and network resource demands, and system administrators to handle management of data across multiple storage systems.

This dissertation explores the low-cost integration of frameworks for data analytics within existing, shared infrastructures. The thesis centers on smart software being the key enabler for holistic commoditization of data analytics. We focus on two instances of smart software that aid in realizing the low-cost integration objective. For an efficient storage integration, we build MixApart, a scalable data analytics framework that removes the dependency on dedicated storage for analytics; with MixApart, a single, consolidated storage back-end manages data and services all types of workloads, thereby lowering hardware costs and simplifying data management. We evaluate MixApart at scale with micro-benchmarks and production workload traces, and show that MixApart provides faster or comparable performance to an analytics framework with dedicated storage. For an effective sharing of the networking infrastructure, we implement OX, a virtual machine management framework that allows latency-sensitive web applications to share the data center network with data analytics through intelligent VM placement; OX further protects all applications from hardware failures. The two solutions allow the reuse of existing storage and networking infrastructures when deploying analytics frameworks, and substantiate our thesis that smart software upgrades can enable the end-to-end commoditization of analytics.
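Bibliographic record

  • Author

    Mihailescu, Madalin

  • Author affiliation

    University of Toronto (Canada)

  • Degree-granting institution: University of Toronto (Canada)
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 118 p.
  • Total pages: 118
  • Original format: PDF
  • Language of text: English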
