Journal: ACM Transactions on Knowledge Discovery from Data

CoCoS: Fast and Accurate Distributed Triangle Counting in Graph Streams


Abstract

Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Specifically, how should edges be processed and sampled across the machines for rapid and accurate estimation?

The count of triangles (i.e., cliques of size three) has proven useful in numerous applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms but little on their combinations for "the best of both worlds."

In this work, we propose CoCoS, a fast and accurate distributed streaming algorithm for estimating the counts of global triangles (i.e., all triangles) and local triangles incident to each node. Making one pass over the input stream, CoCoS carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to baselines, CoCoS is: (a) accurate: giving up to 39x smaller estimation error; (b) fast: up to 10.4x faster, scaling linearly with the size of the input stream; and (c) theoretically sound: yielding unbiased estimates.
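To make the problem setting concrete, the following is a minimal single-machine sketch of one-pass triangle estimation with edge sampling. It is not the CoCoS algorithm, whose distributed processing and storage scheme is not described in this abstract. It is a generic Bernoulli-sampling estimator, and the function name `estimate_triangles`, the sampling probability `p`, and the `seed` parameter are illustrative assumptions. Each arriving edge is first matched against the stored sample, and each triangle it closes adds 1/p^2 to the global count and to the local counts of its three nodes. The edge is then stored with probability p. Assuming the stream contains no duplicate edges, a triangle is counted exactly when its first two edges were both stored, which happens with probability p^2, so the estimates are unbiased.

```python
import random
from collections import defaultdict


def estimate_triangles(edge_stream, p, seed=None):
    """One-pass estimate of global and local triangle counts.

    Illustrative sketch only (not CoCoS): a single-machine estimator that
    keeps each edge with probability p. Assumes an undirected stream with
    no duplicate edges.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")

    rng = random.Random(seed)
    sampled_adj = defaultdict(set)      # adjacency sets of the stored edges
    global_count = 0.0
    local_counts = defaultdict(float)
    weight = 1.0 / (p * p)              # inverse probability that both earlier edges were stored

    for u, v in edge_stream:
        if u == v:
            continue                    # ignore self-loops

        # Every common neighbour w in the sample closes a triangle (u, v, w).
        common = sampled_adj.get(u, set()) & sampled_adj.get(v, set())
        for w in common:
            global_count += weight
            local_counts[u] += weight
            local_counts[v] += weight
            local_counts[w] += weight

        # Store the new edge with probability p (bounded expected storage).
        if rng.random() < p:
            sampled_adj[u].add(v)
            sampled_adj[v].add(u)

    return global_count, dict(local_counts)


if __name__ == "__main__":
    # Complete graph on 4 nodes: 4 triangles, 3 incident to each node.
    k4 = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]
    print(estimate_triangles(k4, p=1.0))
```

With `p=1.0` every edge is stored and the counts are exact. Smaller values of `p` trade accuracy for storage, which is the kind of trade-off a distributed method would aim to improve by coordinating what each machine stores.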
