首页> 外文期刊>Services Computing, IEEE Transactions on >Enumerating Maximal Bicliques from a Large Graph Using MapReduce
【24h】

Enumerating Maximal Bicliques from a Large Graph Using MapReduce

机译:使用MapReduce从大图中枚举最大Bicliques

获取原文
获取原文并翻译 | 示例
       

摘要

We consider the enumeration of maximal bipartite cliques (bicliques) from a large graph, a task central to many data mining problems arising in social network analysis and bioinformatics. We present novel parallel algorithms for the MapReduce framework, and an experimental evaluation using Hadoop MapReduce. Our algorithm is based on clustering the input graph into smaller subgraphs, followed by processing different subgraphs in parallel. Our algorithm uses two ideas that enable it to scale to large graphs: (1) the redundancy in work between different subgraph explorations is minimized through a careful pruning of the search space, and (2) the load on different reducers is balanced through a task assignment that is based on an appropriate total order among the vertices. We show theoretically that our algorithm is work optimal, i.e., it performs the same total work as its sequential counterpart. We present a detailed evaluation which shows that the algorithm scales to large graphs with millions of edges and tens of millions of maximal bicliques. To our knowledge, this is the first work on maximal biclique enumeration for graphs of this scale.
机译:我们考虑从一个较大的图中枚举最大的二方集团(bicliques),这是社交网络分析和生物信息学中出现的许多数据挖掘问题的核心任务。我们为MapReduce框架提供新颖的并行算法,并使用Hadoop MapReduce进行实验评估。我们的算法基于将输入图聚类为较小的子图,然后并行处理不同的子图。我们的算法使用两个想法使其能够缩放到大图:(1)通过精心修剪搜索空间来最小化不同子图探索之间的工作冗余;(2)通过任务来平衡不同化归器上的负载基于顶点之间适当的总顺序的分配。从理论上讲,我们证明了我们的算法是最优的工作,即它执行的总工作量与顺序对等算法相同。我们提供了一个详细的评估,该评估表明该算法可缩放到具有数百万个边和数千万个最大双斜率的大型图形。据我们所知,这是该比例尺图的最大双斜率枚举的第一项工作。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号