International Conference series on Parallel Computing

Static and Dynamic Big Data Partitioning on Apache Spark

Abstract

Many of today's large datasets are organized as graphs. Because of their size, it is often infeasible to process these graphs on a single machine. Therefore, many software frameworks and tools have been proposed to process graphs on top of distributed infrastructures. This software is often bundled with generic data decomposition strategies that are not optimised for specific algorithms. In this paper we study how a specific data partitioning strategy affects the performance of graph algorithms executing on Apache Spark. To this end, we implemented different graph algorithms and compared their performance under a naive partitioning solution against more elaborate strategies, both static and dynamic.