Optimizing Interactive Development of Data-Intensive Applications

Abstract

Modern Data-Intensive Scalable Computing (DISC) systems are designed to process data through batch jobs that execute programs (e.g., queries) compiled from a high-level language. These programs are often developed interactively by posing ad-hoc queries over the base data until a desired result is generated. We observe that the queries used to derive the final program can overlap significantly in structure. Yet each slightly modified query is executed from scratch, which can significantly lengthen the development cycle. Vega is an Apache Spark framework that we have implemented for optimizing a series of similar Spark programs, likely originating from a development or exploratory data analysis session. Spark developers (e.g., data scientists) can use Vega to significantly reduce the time it takes to re-execute a modified Spark program, reducing the overall time to market for their Big Data applications.
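
The abstract does not say how Vega reuses work across runs, so the following is only a rough illustration of the general idea of reusing the results of overlapping query prefixes. It is not Vega's mechanism and does not use Spark. The class `PrefixCache`, the step labels, and the toy pipeline are all hypothetical names introduced here. The sketch assumes that every run uses the same base data, that a step label uniquely identifies its transformation, and that transformations do not mutate their input.

```python
from typing import Any, Callable, Dict, List, Tuple

# A pipeline step: (label assumed to uniquely identify the transformation, transformation)
Step = Tuple[str, Callable[[Any], Any]]


class PrefixCache:
    """Hypothetical helper: caches the output of every pipeline prefix by its step labels."""

    def __init__(self) -> None:
        self._results: Dict[Tuple[str, ...], Any] = {}
        self.steps_executed = 0  # number of transformations actually run

    def run(self, base_data: Any, steps: List[Step]) -> Any:
        labels = tuple(label for label, _ in steps)
        # Look for the longest prefix of this pipeline that was already computed.
        start, data = 0, base_data
        for end in range(len(labels), 0, -1):
            if labels[:end] in self._results:
                start, data = end, self._results[labels[:end]]
                break
        # Execute only the steps that follow the reused prefix, caching as we go.
        for i in range(start, len(steps)):
            data = steps[i][1](data)
            self.steps_executed += 1
            self._results[labels[: i + 1]] = data
        return data


cache = PrefixCache()
base = list(range(10))

# First version of the toy "query": filter, map, aggregate.
v1: List[Step] = [
    ("evens", lambda xs: [x for x in xs if x % 2 == 0]),
    ("square", lambda xs: [x * x for x in xs]),
    ("sum", lambda xs: sum(xs)),
]
first = cache.run(base, v1)   # nothing cached yet, so all three steps run

# Slightly modified version: same first two steps, different final aggregate.
v2: List[Step] = v1[:2] + [("max", lambda xs: max(xs))]
second = cache.run(base, v2)  # the shared two-step prefix is reused

print(first, second, cache.steps_executed)
```

In this toy session the second run executes only the changed final step, since the filter and map results are taken from the cache. A real system would also need to detect when the base data or a transformation's definition has changed, which this sketch sidesteps by assumption.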