Composable and efficient functional big data processing framework

Abstract

Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are roughly defined and packaged as executable jars without any functionality being exposed or described. This means that deployed jobs are not natively composable or reusable for subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations can achieve improvements of between 10% and 60% in Job-Completion-Time for different types of operation sequences when compared with the current state of the art, Apache Spark.
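The idea of optimizing over a functional data dependency graph can be illustrated with a minimal sketch. The code below is not the HDM API or its implementation; `Node`, `source`, `fuse`, `depth`, and `compute` are hypothetical names chosen for illustration, and function fusion is only one plausible example of the kind of optimization the abstract refers to. It assumes a single-machine, in-memory setting.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Node:
    """One node in a lazy data dependency graph (hypothetical, not the HDM API)."""
    op: str                           # "source", "map" or "filter"
    fn: Optional[Callable] = None     # function applied by map/filter nodes
    parent: Optional["Node"] = None   # upstream dependency
    data: tuple = ()                  # payload, used only by source nodes

    def map(self, fn):
        # Composition only records the dependency; nothing is executed yet.
        return Node("map", fn, self)

    def filter(self, pred):
        return Node("filter", pred, self)


def source(items):
    """Wrap an iterable as the root of a dependency graph."""
    return Node("source", data=tuple(items))


def fuse(node):
    """Rewrite the graph so that adjacent map nodes become a single map node."""
    if node.op == "source":
        return node
    parent = fuse(node.parent)
    if node.op == "map" and parent.op == "map":
        f, g = parent.fn, node.fn
        # Bind f and g as defaults so each fused lambda keeps its own pair.
        return Node("map", lambda x, f=f, g=g: g(f(x)), parent.parent)
    return Node(node.op, node.fn, parent)


def depth(node):
    """Count the nodes on the path from this node back to the source."""
    count = 0
    while node is not None:
        count += 1
        node = node.parent
    return count


def compute(node):
    """Evaluate the graph eagerly, one node at a time."""
    if node.op == "source":
        return list(node.data)
    upstream = compute(node.parent)
    if node.op == "map":
        return [node.fn(x) for x in upstream]
    return [x for x in upstream if node.fn(x)]


pipeline = (
    source(range(5))
    .map(lambda x: x + 1)
    .map(lambda x: x * 2)
    .filter(lambda x: x > 4)
)
optimized = fuse(pipeline)
```

Because each job here is a value that exposes its operations rather than an opaque executable, a rewrite such as `fuse` can shorten the graph without changing the result of `compute`.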

Bibliographic record

Source: IEEE International Congress on Big Data, 2015, pp. 279-286 (8 pages)
Authors: Wu Dongyao; Sakr Sherif; Zhu Liming; Lu Qinghua
Original format: PDF
Keywords: big data processing; distributed systems; functional programming; parallel programming; system architecture

Similar literature

1. A hybrid material composed of an amino-functionalized zirconium-based metal-organic framework and a urea-based porous organic polymer as an efficient sorbent for extraction of uranium(VI) [J]. Fotovat Haniyeh, Khajeh Mostafa, Oveisi Ali Reza. Mikrochimica Acta: An International Journal for Physical and Chemical Methods of Analysis, 2018, No. 10
2. Enabling Efficient Late-Stage Functionalization of Drug-Like Molecules with LC-MS and Reaction-Driven Data Processing [J]. Huifang Yao, Yong Liu, Sriram Tyagarajan. European Journal of Organic Chemistry, 2017, No. 47
3. An Efficient Data Analysis Framework for Online Security Processing [J]. Jun Li, Yanzhao Liu. Journal of Computer Networks and Communications, 2021, No. a
4. Composable and efficient functional big data processing framework [C]. Wu Dongyao, Sakr Sherif, Zhu Liming. IEEE International Congress on Big Data, 2015
5. A Differentially-Private and Efficient Framework for Collecting and Processing Network Flow Data [D]. Niculaescu, Oana-Georgiana. 2019
6. Computing Functional Brain Connectivity in Neurological Disorders: Efficient Processing and Retrieval of Electrophysiological Signal Data [O]. Arthur Gershon, Pramith Devulapalli, Bilal Zonjy. 2019
7. Efficient and Customizable Data Partitioning Framework for Distributed Big RDF Data Processing in the Cloud [O]. Kisung Lee, Ling Liu, Yuzhe Tang. 2015
