Conference: IEEE International Conference on Data Engineering

Too Big to Eat: Boosting Analytics Data Ingestion from Object Stores with Scoop



Abstract

Extracting value from data stored in object stores, such as OpenStack Swift and Amazon S3, can be problematic in common scenarios where analytics frameworks and object stores run in physically disaggregated clusters. One of the main problems is that analytics frameworks must ingest large amounts of data from the object store prior to the actual computation; this incurs a significant resource and performance overhead. To overcome this problem, we present Scoop. Scoop enables analytics frameworks to benefit from the computational resources of object stores to optimize the execution of analytics jobs. Scoop achieves this by enabling the addition of ETL-type actions to the data upload path and by offloading querying functions to the object store through a rich and extensible active object storage layer. As a proof-of-concept, Scoop enables Apache Spark SQL selections and projections to be executed close to the data in OpenStack Swift for accelerating analytics workloads of a smart energy grid company (GridPocket). Our experiments in a 63-machine cluster with real IoT data and SQL queries from GridPocket show that Scoop exhibits query execution times up to 30x faster than the traditional “ingest-then-compute” approach.
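
The contrast between "ingest-then-compute" and executing selections and projections close to the data can be illustrated with a minimal Python sketch. This is not Scoop's implementation or API: the CSV layout, the sample readings, and the function names (`pushdown_filter`, `ingest_then_compute`, `compute_with_pushdown`) are all hypothetical and chosen only for illustration. The sketch simply shows that both paths return the same rows while the pushdown path moves fewer bytes out of the store.

```python
import csv
import io

# Hypothetical object contents: made-up meter readings stored as CSV.
RAW_OBJECT = (
    "meter_id,city,timestamp,kwh\n"
    "m1,Paris,2015-01-01T00:00,1.2\n"
    "m2,Nice,2015-01-01T00:00,0.7\n"
    "m3,Paris,2015-01-01T00:00,2.5\n"
)


def pushdown_filter(raw, columns, predicate):
    """Stand-in for logic running at the storage side: apply the
    selection (predicate) and projection (columns) before any bytes
    leave the store."""
    reader = csv.DictReader(io.StringIO(raw))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if predicate(row):
            writer.writerow({c: row[c] for c in columns})
    return out.getvalue()


def ingest_then_compute(raw, columns, predicate):
    """Baseline: the whole object is transferred, then filtered
    by the analytics side."""
    transferred = raw  # every byte crosses the network
    rows = [
        {c: r[c] for c in columns}
        for r in csv.DictReader(io.StringIO(transferred))
        if predicate(r)
    ]
    return rows, len(transferred)


def compute_with_pushdown(raw, columns, predicate):
    """Pushdown: only the filtered, projected rows are transferred."""
    transferred = pushdown_filter(raw, columns, predicate)
    rows = [dict(r) for r in csv.DictReader(io.StringIO(transferred))]
    return rows, len(transferred)


def is_paris(row):
    return row["city"] == "Paris"


baseline_rows, baseline_bytes = ingest_then_compute(
    RAW_OBJECT, ["meter_id", "kwh"], is_paris
)
pushdown_rows, pushdown_bytes = compute_with_pushdown(
    RAW_OBJECT, ["meter_id", "kwh"], is_paris
)
```

In this toy setting the saving comes purely from transferring less data; the speedups reported in the abstract depend on the real cluster, data, and queries and cannot be reproduced by a sketch like this.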
