首页> 外文会议>International ACM SIGIR conference on research development in information retrieval >SimFusion+: Extending SimFusion Towards Efficient Estimation on Large and Dynamic Networks
【24h】

SimFusion+: Extending SimFusion Towards Efficient Estimation on Large and Dynamic Networks

机译:SimFusion +:将SimFusion扩展到大型动态网络上的有效估计

获取原文

摘要

SimFusion has become a captivating measure of similarity between objects in a web graph, It is iteratively distilled from the notion that "the similarity between two objects is reinforced by the similarity of their related objects". The existing SimFusion model usually exploits the Unified Relationship Matrix (URM) to represent latent relationships among heterogeneous data, and adopts an iterative paradigm for SimFusion computation. However, due to the row normalization of URM, the traditional SimFusion model may produce the trivial solution; worse still, the iterative computation of SimFusion may not ensure the global convergence of the solution. This paper studies the revision of this model, providing a full treatment from complexity to algorithms. (1) We propose SirnFu-sion+ based on a notion of the Unified Adjacency Matrix (UAM), a modification of the URM, to prevent the trivial solution and the divergence issue of SimFusion. (2) We show thai for any vertex-pair, SimFusion+- can be performed in O(1) time and O(n) space with an O(km)-time precomputation done only once, as opposed to the O(kn~3 ) time and O(n~2) space of its traditional counterpart, where n, m, and k denote the number of vertices, edges, and iterations respectively. (3) We also devise an incremental algorithm for further improving the computation of SimFusion+ when networks are dynamically updated, with performance guarantees for similarity estimation. We experimentally verify that these algorithms scale well, and the revised notion of SimFusion is able to converge to a non-trivial solution, and allows us to identify more sensible structure information in large real-world networks.
机译:SimFusion已成为Web图形中对象之间相似性的一种迷人度量,它从“两个对象之间的相似性通过其相关对象的相似性得到增强”的概念中反复提炼出来。现有的SimFusion模型通常利用统一关系矩阵(URM)表示异构数据之间的潜在关系,并为SimFusion计算采用迭代范式。但是,由于URM的行规范化,传统的SimFusion模型可能会产生微不足道的解决方案。更糟糕的是,SimFusion的迭代计算可能无法确保解决方案的全局收敛性。本文研究了该模型的修订版,提供了从复杂性到算法的全面处理。 (1)我们提出了SirnFu-sion +,它基于对URM的修改的统一邻接矩阵(UAM)的概念,以防止琐碎的解决方案和SimFusion的分歧问题。 (2)我们证明对于任何顶点对,SimFusion +-都可以在O(1)时间和O(n)空间中执行,而O(km)时间预计算仅执行一次,而不是O(kn〜 3)时间和传统空间的O(n〜2)空间,其中n,m和k分别表示顶点数,边数和迭代数。 (3)我们还设计了一种增量算法,用于在网络动态更新时进一步改进SimFusion +的计算,并为相似性估计提供性能保证。我们通过实验验证了这些算法的可扩展性,并且SimFusion的修订概念能够收敛到非平凡的解决方案,并允许我们在大型现实世界网络中识别更明智的结构信息。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号