首页> 外文会议>IEEE International Conference on Data Mining >WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams
【24h】

WRS: Waiting Room Sampling for Accurate Triangle Counting in Real Graph Streams

机译:WRS:等候室采样,用于在实图流中进行精确的三角计数

获取原文

摘要

If we cannot store all edges in a graph stream, which edges should we store to estimate the triangle count accurately?Counting triangles (i.e., cycles of length three) is a fundamental graph problem with many applications in social network analysis, web mining, anomaly detection, etc. Recently, much effort has been made to accurately estimate global and local triangle counts in streaming settings with limited space. Although existing methods use sampling techniques without considering temporal dependencies in edges, we observe temporal locality in real dynamic graphs. That is, future edges are more likely to form triangles with recent edges than with older edges. In this work, we propose a single-pass streaming algorithm called Waiting-Room Sampling (WRS) for global and local triangle counting. WRS exploits the temporal locality by always storing the most recent edges, which future edges are more likely to form triangles with, in the waiting room, while it uses reservoir sampling for the remaining edges. Our theoretical and empirical analyses show that WRS is: (a) Fast and 'any time': runs in linear time, always maintaining and updating estimates while new edges arrive, (b) Effective: yields up to 47% smaller estimation error than its best competitors, and (c) Theoretically sound: gives unbiased estimates with small variances under the temporal locality.
机译:如果不能将所有边存储在图流中,则应存储哪些边以准确估计三角形数?三角形计数(即长度为3的循环)是基本图问题,在社交网络分析,Web挖掘,异常中有许多应用最近,人们进行了很多努力来准确地估计空间有限的流设置中的全局和局部三角形计数。尽管现有方法使用采样技术时并未考虑边缘中的时间依赖性,但我们在实际动态图中观察到了时间局部性。也就是说,与较旧的边缘相比,较新的边缘更容易形成三角形。在这项工作中,我们提出了一种称为等待室采样(WRS)的单程流算法,用于全局和局部三角形计数。 WRS通过始终存储最新的边缘来利用时间局部性,而未来的边缘更可能在候诊室中与之形成三角形,而WRS则将水库采样用于​​其余的边缘。我们的理论和经验分析表明,WRS是:(a)快速且“任意时间”:以线性时间运行,在新的边沿到达时始终保持和更新估计值,(b)有效:产生的估计误差比估计误差小47% (c)从理论上讲是正确的:在时间局部性下给出无偏差的估计值,且方差很小。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号