首页> 外文会议>ACMKDD International Conference on Knowledge Discovery and Data Mining;KDD 2008 >Efficient Semi-streaming Algorithms for Local Triangle Counting in Massive Graphs
【24h】

Efficient Semi-streaming Algorithms for Local Triangle Counting in Massive Graphs

机译:大规模图中局部三角形计数的高效半流算法

获取原文

摘要

In this paper we study the problem of local triangle counting in large graphs. Namely, given a large graph G = (V, E) we want to estimate as accurately as possible the number of triangles incident to every node v 6 V in the graph. The problem of computing the global number of triangles in a graph has been considered before, but to our knowledge this is the first paper that addresses the problem of local triangle counting with a focus on the efficiency issues arising in massive graphs. The distribution of the local number of triangles and the related local clustering coefficient can be used in many interesting applications. For example, we show that the measures we compute can help to detect the presence of spamming activity in large-scale Web graphs, as well as to provide useful features to assess content quality in social networks.For computing the local number of triangles we propose two approximation algorithms, which are based on the idea of min-wise independent permutations (Broder et al. 1998). Our algorithms operate in a semi-streaming fashion, using O(|V|) space in main memory and performing O(log|V|) sequential scans over the edges of the graph. The first algorithm we describe in this paper also uses O(|E|) space in external memory during computation, while the second algorithm uses only main memory. We present the theoretical analysis as well as experimental results in massive graphs demonstrating the practical efficiency of our approach.
机译:在本文中,我们研究了在大型图中进行局部三角形计数的问题。即,给定大图G =(V,E),我们希望尽可能准确地估计入射到图中每个节点v 6 V的三角形的数量。以前曾考虑过计算图中三角形的整体数目的问题,但是据我们所知,这是第一篇解决局部三角形计数问题的论文,重点关注大规模图中出现的效率问题。三角形的局部数量的分布和相关的局部聚类系数可用于许多有趣的应用程序中。例如,我们表明,我们计算出的度量值可以帮助检测大型Web图中垃圾邮件活动的存在,并提供有用的功能来评估社交网络中的内容质量。 为了计算三角形的局部数量,我们提出了两种近似算法,它们基于最小独立排列的思想(Broder等,1998)。我们的算法以半流方式运行,使用主内存中的O(| V |)空间并在图形的边缘执行O(log | V |)顺序扫描。我们在本文中描述的第一种算法在计算过程中也使用了外部存储器中的O(| E |)空间,而第二种算法仅使用了主存储器。我们在大量图表中展示了理论分析和实验结果,展示了我们方法的实际效率。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号