Conference: IEEE International Congress on Big Data

Composable and efficient functional big data processing framework

Abstract

In recent years, frameworks such as MapReduce and Spark have been introduced to ease the development of big data programs and applications. However, jobs in these frameworks are coarsely defined and packaged as executable jars without exposing or describing their functionality. As a result, deployed jobs are not natively composable or reusable in subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations achieve improvements of between 10% and 60% in job completion time for different types of operation sequences when compared with the current state of the art, Apache Spark.
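To make the idea of a composable, functional data flow more concrete, here is a minimal sketch in Python. It is not the HDM API, which the abstract does not show: `Node`, `source`, `fuse_maps`, `depth`, `run`, and `clean` are hypothetical names chosen for illustration, and map fusion is used only as one plausible example of an optimization over a functional dependency graph. The abstract does not say which optimizations HDM applies.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Node:
    """One step in a lazily built data flow: an operation applied to a parent."""
    op: str                        # "source", "map", or "filter"
    fn: Optional[Callable] = None  # function applied by map/filter steps
    parent: Optional["Node"] = None
    data: tuple = ()               # only used by "source" nodes

    def map(self, fn: Callable[[Any], Any]) -> "Node":
        return Node("map", fn, self)

    def filter(self, pred: Callable[[Any], bool]) -> "Node":
        return Node("filter", pred, self)


def source(items) -> Node:
    return Node("source", data=tuple(items))


def fuse_maps(node: Node) -> Node:
    """Rewrite the graph so that adjacent map steps become a single map."""
    if node.parent is None:
        return node
    parent = fuse_maps(node.parent)
    if node.op == "map" and parent.op == "map":
        f, g = parent.fn, node.fn
        return Node("map", lambda x, f=f, g=g: g(f(x)), parent.parent)
    return Node(node.op, node.fn, parent)


def depth(node: Node) -> int:
    """Number of steps from the source to this node, inclusive."""
    return 1 if node.parent is None else 1 + depth(node.parent)


def run(node: Node) -> List[Any]:
    """Evaluate the graph locally (a stand-in for distributed execution)."""
    if node.op == "source":
        return list(node.data)
    rows = run(node.parent)
    if node.op == "map":
        return [node.fn(x) for x in rows]
    return [x for x in rows if node.fn(x)]


# A reusable pipeline fragment: any function from Node to Node composes.
def clean(ds: Node) -> Node:
    return ds.map(str.strip).map(str.lower)


pipeline = clean(source(["  Spark ", "HDM", " "])).filter(bool)
optimized = fuse_maps(pipeline)
```

Because each step is a plain value that records its function and its parent, a fragment such as `clean` can be reused in other pipelines, and a rewrite such as `fuse_maps` can inspect and restructure the graph before anything runs. A job packaged as an opaque jar offers neither property.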