Conference: IEEE International Conference on Data Engineering

Mining large graphs: Algorithms, inference, and discoveries


Abstract

How do we find patterns and anomalies on graphs with billions of nodes and edges, which do not fit in memory? How can we use parallelism for such terabyte-scale graphs? In this work, we focus on inference, which often corresponds, intuitively, to “guilt by association” scenarios. For example, if a person is a drug abuser, his or her friends probably are, too; if a node in a social network is male, his dates are probably female. We show how to do inference on such huge graphs through our proposed HAdoop Line graph Fixed Point (Ha-Lfp), an efficient parallel algorithm for sparse billion-scale graphs that runs on the Hadoop platform. Our contributions include (a) the design of Ha-Lfp, based on the observation that it corresponds to a fixed point on a line graph induced from the original graph; (b) a scalability analysis, showing that our algorithm scales up well with the number of edges as well as with the number of machines; and (c) experimental results on two private graphs and two of the largest publicly available graphs: the Web Graphs from Yahoo! (6.6 billion edges and 0.24 terabytes) and the Twitter graph (3.7 billion edges and 0.13 terabytes). We evaluated our algorithm on M45, one of the top 50 fastest supercomputers in the world, and we report patterns and anomalies discovered by our algorithm that would otherwise be invisible.
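The idea of inference as a fixed point on a line graph can be illustrated with a small, single-machine sketch. It is not the authors' Ha-Lfp implementation and does not use Hadoop. It assumes a standard two-state loopy belief propagation rule, which the abstract does not spell out. The function names `directed_line_graph` and `propagate`, the `homophily` parameter, and the toy graph are hypothetical and chosen only for illustration. Each directed edge of the original graph becomes a node of the line graph, and the message on edge (i, j) is recomputed from the messages on edges (k, i) with k different from j until the messages stop changing.

```python
from collections import defaultdict


def directed_line_graph(edges):
    """Map each directed edge (i, j) to the directed edges (k, i), k != j, that feed it."""
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    incoming = defaultdict(list)
    for k, i in directed:
        incoming[i].append((k, i))
    return {(i, j): [e for e in incoming[i] if e[0] != j] for i, j in directed}


def propagate(edges, priors, homophily=0.9, iterations=20):
    """Two-state belief propagation written as a fixed-point iteration on edge messages.

    homophily > 0.5 models "friends are alike"; homophily < 0.5 models the
    opposite case, where neighbors tend to be in different states.
    """
    line_graph = directed_line_graph(edges)
    # Assumed edge potential: probability that two neighbors share a state.
    psi = [[homophily, 1 - homophily], [1 - homophily, homophily]]
    messages = {e: [0.5, 0.5] for e in line_graph}

    for _ in range(iterations):
        updated = {}
        for (i, j), feeders in line_graph.items():
            # Combine the prior of i with every message into i except the one from j.
            local = list(priors.get(i, [0.5, 0.5]))
            for f in feeders:
                local = [local[s] * messages[f][s] for s in (0, 1)]
            out = [sum(local[s] * psi[s][t] for s in (0, 1)) for t in (0, 1)]
            total = sum(out)
            updated[(i, j)] = [x / total for x in out]
        messages = updated

    # Final belief of a node: its prior times all incoming messages, normalized.
    beliefs = {}
    for n in {n for e in edges for n in e}:
        b = list(priors.get(n, [0.5, 0.5]))
        for (k, i), m in messages.items():
            if i == n:
                b = [b[s] * m[s] for s in (0, 1)]
        total = sum(b)
        beliefs[n] = [x / total for x in b]
    return beliefs
```

Calling `propagate([("a", "b"), ("b", "c")], {"a": [0.9, 0.1]})` on a three-node path pulls the beliefs of `b` and `c` toward state 0, with the effect weakening as distance from `a` grows. In a MapReduce setting each message update would be a join between the line-graph edges and the current message table, but that mapping is only an assumption here.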
