【24h】

Assembling a network out of ambiguous patches

机译:用模棱两可的补丁组装网络

获取原文

摘要

Many graph mining and network analysis problems rely on the availability of the full network over a set of nodes. But inferring a full network is sometimes non-trivial if the raw data is in the form of many small patches or subgraphs, of the true network, and if there are ambiguities in the identities of nodes or edges in these patches. This may happen because of noise or because of the nature of data; for instance, in social networks, names are typically not unique. Graph assembly refers to the problem of reconstructing a graph from these many, possibly noisy, partial observations. Prior work suggests that graph assembly is essentially impossible in regimes of interest when the true graph is Erdös-Rényi. The purpose of the present paper is to show that a modest amount of clustering is sufficient to assemble even very large graphs. We introduce the G(n, p; q) random graph model, which is the random closure over all open triangles of a G(n, p) Erdos-Renyi, and show that this model exhibits higher clustering than an equivalent Erdos-Renyi. We focus on an extreme case of graph assembly: the patches are small (1-hop egonets) and are unlabeled. We show that in realistic regimes, graph assembly is fundamentally feasible, because we can identify, for every edge e, a subgraph induced by its neighbors that is unique and present in every patch containing e. Using this result, we build a practical algorithm that uses canonical labeling to reconstruct the original graph from noiseless patches. We also provide an achievability result for noisy patches, which are obtained by edge-sampling the original egonets.
机译:许多图挖掘和网络分析问题都依赖于整个网络在一组节点上的可用性。但是,如果原始数据是真实网络的许多小补丁或子图的形式,并且这些补丁中的节点或边缘的身份不明确,则推断整个网络有时并非易事。这可能是由于噪声或数据的性质而发生的;例如,在社交网络中,名称通常不是唯一的。图装配是指从许多可能有噪声的部分观测结果中重建图的问题。先前的工作表明,当真正的图形是Erdös-Rényi时,在感兴趣的方案中图形组装基本上是不可能的。本文的目的是表明适度的聚类足以组装甚至非常大的图。我们介绍了G(n,p; q)随机图模型,该模型是对G(n,p)的鄂尔多斯-仁义的所有开放三角形的随机闭合,并表明该模型比同等的鄂尔多斯-仁义具有更高的聚类性。我们关注图装配的极端情况:面片很小(1跳egonets)并且没有标签。我们表明,在现实的情况下,图装配在根本上是可行的,因为我们可以为每个边缘e识别由其邻居诱导的子图,该子图是唯一的,并且存在于每个包含e的面片中。使用此结果,我们构建了一种实用的算法,该算法使用规范标记从无噪声的补丁中重建原始图形。我们还提供了对噪声斑块的可实现性结果,这些噪声斑块是通过对原始鹅蛋进行边缘采样而获得的。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号