Journal: 《应用科技》 (Applied Science and Technology)

A MapReduce-Based Parallel Computing Model for Large-Scale Graph Mining

Abstract

Large-scale graphs are ubiquitous, and the rapid growth in their size and complexity makes it increasingly difficult to discover the structure and properties of a large network quickly. MapReduce and its open-source implementation, Hadoop, offer a promising route to processing such graphs efficiently. Enumerating the 3-cliques of a graph is an important operation in structure mining, but it is hard to carry out on a single machine when the graph is very large. This paper proposes a parallel computing model, built on a MapReduce cluster, for 3-clique enumeration on large-scale graphs. The computation proceeds in three steps: first extract the one-hop information of each node, then the two-hop information, and finally enumerate all 3-cliques keyed on that node. The model is also used to compute the clustering coefficient, and it is applied to three large real-world call graphs. Experimental results show that the model scales well and performs efficiently.
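
The three steps named in the abstract (one-hop information, two-hop information, key-based 3-clique enumeration) can be illustrated on a single machine. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not give the actual map and reduce functions, so the function names (`one_hop`, `two_hop`, `three_cliques`, `clustering_coefficients`), the in-memory dictionaries standing in for MapReduce rounds, and the rule of only pairing neighbors with larger IDs to avoid duplicates are all assumptions made for illustration. A 3-clique here is a triangle, and the clustering coefficient uses the standard local definition 2T(v) / (d(v)(d(v) - 1)).

```python
from collections import defaultdict
from itertools import combinations


def one_hop(edges):
    """Round 1 (assumed): group edges by endpoint to get each node's neighbor set."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def two_hop(neighbors):
    """Round 2 (assumed): emit two-hop paths u - v - w keyed by the pair (u, w).

    Only neighbors with an ID larger than v are paired, so each
    triangle is generated exactly once, from its smallest node.
    """
    paths = defaultdict(list)
    for v, nbrs in neighbors.items():
        higher = sorted(n for n in nbrs if n > v)
        for u, w in combinations(higher, 2):
            paths[(u, w)].append(v)
    return paths


def three_cliques(edges):
    """Round 3 (assumed): keep a two-hop path only if its endpoints are adjacent."""
    neighbors = one_hop(edges)
    triangles = []
    for (u, w), centers in two_hop(neighbors).items():
        if w in neighbors[u]:  # closing edge exists, so (v, u, w) is a 3-clique
            triangles.extend((v, u, w) for v in centers)
    return sorted(triangles)


def clustering_coefficients(edges):
    """Local clustering coefficient per node: 2 * T(v) / (d(v) * (d(v) - 1))."""
    neighbors = one_hop(edges)
    per_node = defaultdict(int)
    for tri in three_cliques(edges):
        for node in tri:
            per_node[node] += 1
    coeffs = {}
    for v, nbrs in neighbors.items():
        d = len(nbrs)
        coeffs[v] = 2 * per_node[v] / (d * (d - 1)) if d > 1 else 0.0
    return coeffs


if __name__ == "__main__":
    sample = [(1, 2), (2, 3), (1, 3), (3, 4)]  # one triangle plus a pendant edge
    print(three_cliques(sample))
    print(clustering_coefficients(sample))
```

In a real Hadoop job each function would correspond to roughly one map/reduce round, with the dictionary keys playing the role of shuffle keys; how the paper actually partitions the work is not stated in the abstract.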
