Composable and efficient functional big data processing framework

Abstract

Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are roughly defined and packaged as executable jars without any functionality being exposed or described. This means that deployed jobs are not natively composable or reusable for subsequent development. It also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), which is a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations can achieve improvements of between 10% and 60% in job completion time for different types of operation sequences when compared with the current state of the art, Apache Spark.
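The idea of composable jobs whose data flow can be optimized is easier to see in a small example. The following is a minimal sketch in Python, not the HDM API: the names `Op`, `Pipeline`, `fuse_maps`, and `execute` are hypothetical, and fusing adjacent map steps is an assumed stand-in for the kind of optimization a functional dependency graph makes possible. The abstract does not say which optimizations HDM applies.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Op:
    """One step in the data flow (hypothetical; not an HDM type)."""
    kind: str                     # "map" or "filter"
    fn: Callable[[Any], Any]


@dataclass(frozen=True)
class Pipeline:
    """An immutable job description that exposes its steps instead of hiding them."""
    ops: Tuple[Op, ...] = ()

    def map(self, fn: Callable[[Any], Any]) -> "Pipeline":
        return Pipeline(self.ops + (Op("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "Pipeline":
        return Pipeline(self.ops + (Op("filter", pred),))

    def then(self, other: "Pipeline") -> "Pipeline":
        # Composition: reuse an existing job as the prefix of a new one.
        return Pipeline(self.ops + other.ops)


def compose(f: Callable[[Any], Any], g: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Return the function x -> g(f(x))."""
    return lambda x: g(f(x))


def fuse_maps(p: Pipeline) -> Pipeline:
    """Merge each run of adjacent map steps into a single map step."""
    fused: List[Op] = []
    for op in p.ops:
        if fused and op.kind == "map" and fused[-1].kind == "map":
            fused[-1] = Op("map", compose(fused[-1].fn, op.fn))
        else:
            fused.append(op)
    return Pipeline(tuple(fused))


def execute(p: Pipeline, data: Iterable[Any]) -> List[Any]:
    """Run the steps in order on an in-memory list (no distribution)."""
    out = list(data)
    for op in p.ops:
        if op.kind == "map":
            out = [op.fn(x) for x in out]
        elif op.kind == "filter":
            out = [x for x in out if op.fn(x)]
        else:
            raise ValueError(f"unknown op kind: {op.kind}")
    return out


# Two independently written jobs, composed into one and then optimized.
clean = Pipeline().map(str.strip).map(str.lower)
keep_long = Pipeline().filter(lambda w: len(w) > 2)
job = clean.then(keep_long)
optimized = fuse_maps(job)
```

Because each job is a value that lists its own steps, `job` can be built from `clean` and `keep_long` without repackaging either, and `fuse_maps` can shorten the three-step flow to two steps without changing the result of `execute`.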

Bibliographic details

Source: IEEE International Congress on Big Data, 2015, 8 pages
Authors: Wu Dongyao; Sakr Sherif; Zhu Liming; Lu Qinghua
Original format: PDF
Chinese Library Classification: Programming; software engineering
Keywords: big data processing; distributed systems; functional programming; parallel programming; system architecture
