Journal: Cloud Computing, IEEE Transactions on

Deploying Large-Scale Datasets on-Demand in the Cloud: Treats and Tricks on Data Distribution



Abstract

Public clouds have democratised access to analytics for virtually any institution in the world. Virtual machines (VMs) can be provisioned on demand to crunch data once it has been uploaded into them. While this task is trivial for a few tens of VMs, it becomes increasingly complex and time-consuming when the scale grows to hundreds or thousands of VMs crunching tens or hundreds of TB. Moreover, the elapsed time comes at a price: the cost of provisioning VMs in the cloud and keeping them waiting while the data loads. In this paper we present a big data provisioning service that incorporates hierarchical and peer-to-peer data distribution techniques to speed up data loading into the VMs used for data processing. The system dynamically mutates the sources of the data for the VMs to speed up data loading. We tested this solution with 1000 VMs and 100 TB of data, reducing time by at least 30 percent over current state-of-the-art techniques. This dynamic topology mechanism is tightly coupled with classic declarative machine configuration techniques (the system takes a single high-level declarative configuration file and configures both software and data loading). Together, these two techniques simplify the deployment of big data in the cloud for end users who may not be experts in infrastructure management.
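To give a rough intuition for why peer-to-peer distribution can help at scale, the following is a minimal toy model, not the system described in the paper. It assumes (hypothetically) that time is split into equal rounds, that every node holding the data can serve a fixed number of other VMs per round, and that a VM becomes a source as soon as it has finished loading. The function names and the `slots` parameter are illustrative only.

```python
def rounds_central(num_vms: int, source_slots: int) -> int:
    """Rounds needed when only the origin serves data.

    Assumption: the origin can feed `source_slots` VMs per round.
    """
    return -(-num_vms // source_slots)  # ceiling division


def rounds_peer_to_peer(num_vms: int, slots_per_node: int) -> int:
    """Rounds needed when every node that already holds the data also serves it.

    Assumption: each holder (origin or loaded VM) feeds `slots_per_node`
    new VMs per round, so the number of sources grows every round.
    """
    holders = 1   # only the origin holds the data at the start
    loaded = 0    # VMs that have finished loading
    rounds = 0
    while loaded < num_vms:
        newly_loaded = min(holders * slots_per_node, num_vms - loaded)
        loaded += newly_loaded
        holders += newly_loaded  # freshly loaded VMs become sources
        rounds += 1
    return rounds


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, rounds_central(n, 4), rounds_peer_to_peer(n, 4))
```

In this model the central scheme grows linearly with the number of VMs, while the peer-to-peer scheme grows roughly logarithmically. Real deployments are limited by network bandwidth, data partitioning, and VM boot time, none of which this sketch captures, so it should not be read as a reproduction of the reported 30 percent improvement.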
