首页> 外文期刊>Frontiers of computer science >An efficient parallel algorithm of N-hop neighborhoods on graphs in distributed environment
【24h】

An efficient parallel algorithm of N-hop neighborhoods on graphs in distributed environment

机译:分布式环境下图中N-Hop邻域的高效并行算法

获取原文
获取原文并翻译 | 示例
           

摘要

N-hop neighborhoods information is very useful in analytic tasks on large-scale graphs, like finding clique in a social network, recommending friends or advertising links according to one's interests, predicting links among websites and etc. To get the N-hop neighborhoods information on a large graph, such as a web graph, a twitter social graph, the most straightforward method is to conduct a breadth first search (BFS) on a parallel distributed graph processing framework, such as Pregel and GraphLab. However, due to the massive volume of message transfer, the BFS method results in high communication cost and has low efficiency.In this work, we propose a key/value based method, namely KVB, which perfectly fits into the prevailing parallel graph processing framework and computes N-hop neighborhoods on a large scale graph efficiently. Unlike the BFS method, our method need not transfer large amount of neighborhoods information, thus, significantly reduces the overhead on both the communication and intermediate results in the distributed framework.We formalize the N-hop neighborhoods query processing as an optimization problem based on a set of quantitative cost metrics of parallel graph processing. Moreover, we propose a solution to efficiently load only the relevant neighborhoods for computation. Specially, we prove the optimal partial neighborhoods load problem is NP-hard and carefully design a heuristic strategy. We have implemented our algorithm on a distributed graph framework- Spark GraphX and validated our solution with extensive experiments over a number of real world and synthetic large graphs on a modest indoor cluster. Experiments show that our solution generally gains an order of magnitude speedup comparing to the state-of-art BFS implementation.
机译:N-Hop邻域信息在大规模图中的分析任务中非常有用,如在社交网络中查找集团,根据一个人的兴趣,推荐朋友或广告链接,预测网站等的链接,以获取N-Hop邻居信息在大图中,例如Web图,Twitter社交图中,最简单的方法是在并行分布图形处理框架上进行广度第一搜索(BFS),例如Pregel和Graphlab。但是,由于大量的消息传输量,BFS方法导致高通信成本并具有低效率。在此工作中,我们提出了一种基于键/值的方法,即KVB,它完全适合普遍的并行图形处理框架并有效地计算大规模图中的N-HOP邻域。与BFS方法不同,我们的方法不需要传输大量的邻域信息,因此,在分布式框架中显着降低了通信和中间结果的开销。我们将N-Hop邻域查询处理形式为基于A的优化问题。并行图处理的定量成本度量集。此外,我们提出了一种解决方案,以有效地仅加载相关街区以进行计算。特别是,我们证明了最佳的部分邻域负载问题是NP - 硬,并仔细设计启发式策略。我们在分布式图形框架 - Spark Graphx上实现了我们的算法,并通过了在适度的室内集群上的许多现实世界和合成大图中进行了广泛的实验,验证了我们的解决方案。实验表明,与最先进的BFS实现相比,我们的解决方案一般增加了一系列加速顺序。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号