Journal: Concurrency, practice and experience

JS4Cloud: script-based workflow programming for scalable data analysis on cloud platforms

Abstract

Workflows are an effective paradigm to model complex data analysis processes, such as knowledge discovery in databases applications, which can be efficiently executed on distributed computing systems such as a Cloud platform. Data analysis workflows can be designed through visual programming, which is a convenient design approach for high-level users. On the other hand, script-based workflows are a useful alternative to visual workflows, because they allow expert users to program complex applications more effectively. In order to provide Cloud users with an effective script-based data analysis workflow formalism, we designed the JS4Cloud language. The main benefits of JS4Cloud are as follows: (i) it extends the well-known JavaScript language while using only its basic functions (arrays, functions, and loops); (ii) it implements both a data-driven task parallelism that automatically spawns ready-to-run tasks to the Cloud resources and data parallelism through an array-based formalism; and (iii) these two types of parallelism are exploited implicitly so that workflows can be programmed in a fully sequential way, which frees users from duties like work partitioning, synchronization, and communication. We describe how JS4Cloud has been integrated within the data mining cloud framework (DMCF), a system supporting the scalable execution of data analysis workflows on Cloud platforms. In particular, we describe how data analysis workflows modeled as JS4Cloud scripts are processed by DMCF by exploiting parallelism to enable their scalable execution on Clouds. Finally, we present some data analysis workflows developed with JS4Cloud and the performance results obtained by executing such workflows on DMCF.
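The idea of implicit, data-driven parallelism described in the abstract can be illustrated with a small sketch in plain JavaScript. This is not JS4Cloud code and does not use the DMCF API: the helpers `task` and `scheduleWaves`, the task names, and the data item names are all hypothetical, chosen only for illustration. The workflow is declared in sequential order, and a toy scheduler groups tasks into "waves" whose inputs are all available, which is a simplification of how a runtime could spawn ready-to-run tasks. The array of partitions stands in for the array-based data parallelism.

```javascript
// Illustrative sketch only: NOT the JS4Cloud or DMCF API.
// Every name below (task, scheduleWaves, data item names) is hypothetical.

// A task declares the data items it reads and the data items it produces.
function task(name, inputs, outputs) {
  return { name, inputs, outputs };
}

// Group tasks into waves. Every task in a wave has all of its inputs
// available, so a runtime could dispatch the whole wave concurrently.
function scheduleWaves(tasks, initialData) {
  const available = new Set(initialData);
  let pending = tasks.slice();
  const waves = [];
  while (pending.length > 0) {
    // Readiness is decided before this wave's outputs are published,
    // so tasks within one wave never depend on each other.
    const ready = pending.filter((t) => t.inputs.every((d) => available.has(d)));
    if (ready.length === 0) {
      throw new Error(
        "Unsatisfiable dependencies: " + pending.map((t) => t.name).join(", ")
      );
    }
    for (const t of ready) {
      t.outputs.forEach((d) => available.add(d));
    }
    pending = pending.filter((t) => !ready.includes(t));
    waves.push(ready.map((t) => t.name));
  }
  return waves;
}

// The workflow is written in plain sequential order.
const parts = ["part0", "part1", "part2"];
const models = parts.map((_, i) => "model" + i);

const workflow = [
  task("partition", ["dataset"], parts),
  // Array-based data parallelism: one training task per partition.
  ...parts.map((p, i) => task("train" + i, [p], [models[i]])),
  task("selectBest", models, ["bestModel"]),
];

const waves = scheduleWaves(workflow, ["dataset"]);
console.log(waves);
```

In this sketch the three training tasks land in the same wave because each depends only on its own partition, while `selectBest` waits until every model exists. No synchronization or partitioning logic appears in the workflow declaration itself.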
