Proceedings of the National Academy of Sciences of the United States of America

Colloquium Paper, Mapping Knowledge Domains: Tracking evolving communities in large linked networks



Abstract

We are interested in tracking changes in large-scale data by periodically creating an agglomerative clustering and examining the evolution of clusters (communities) over time. We examine a large real-world data set: the NEC CiteSeer database, a linked network of >250,000 papers. Tracking changes over time requires a clustering algorithm that produces clusters stable under small perturbations of the input data. However, small perturbations of the CiteSeer data lead to significant changes to most of the clusters. One reason for this is that the order in which papers within communities are combined is somewhat arbitrary. However, certain subsets of papers, called natural communities, correspond to real structure in the CiteSeer database and thus appear in any clustering. By identifying the subset of clusters that remain stable under multiple clustering runs, we get the set of natural communities that we can track over time. We demonstrate that such natural communities allow us to identify emerging communities and track temporal changes in the underlying structure of our network data.
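
The stability filter described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give a similarity measure or thresholds, so the `match` score (the smaller of the two overlap fractions), the `min_match` and `min_fraction` parameters, and all function names are assumptions made for this example. The clustering runs are hand-written toy data standing in for the output of an agglomerative clustering applied to perturbed copies of the citation network.

```python
from typing import FrozenSet, Iterable, List

Cluster = FrozenSet[str]


def match(a: Cluster, b: Cluster) -> float:
    """Overlap score in [0, 1]; 1.0 only when the two clusters are identical.

    Assumed measure: the smaller of the two overlap fractions.
    """
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return min(shared / len(a), shared / len(b))


def best_match(cluster: Cluster, clustering: Iterable[Cluster]) -> float:
    """Highest match score between `cluster` and any cluster of one run."""
    return max((match(cluster, other) for other in clustering), default=0.0)


def natural_communities(
    reference: List[Cluster],
    perturbed_runs: List[List[Cluster]],
    min_match: float = 0.7,     # hypothetical threshold
    min_fraction: float = 0.6,  # hypothetical threshold
) -> List[Cluster]:
    """Keep reference clusters that reappear in enough perturbed runs."""
    if not perturbed_runs:
        return []
    kept = []
    for cluster in reference:
        # Count the runs in which some cluster closely matches this one.
        hits = sum(
            1 for run in perturbed_runs if best_match(cluster, run) >= min_match
        )
        if hits / len(perturbed_runs) >= min_fraction:
            kept.append(cluster)
    return kept


# Toy data: one clustering of the unperturbed input and three of perturbed input.
reference = [
    frozenset({"p1", "p2", "p3", "p4"}),
    frozenset({"p5", "p6", "p7"}),
]
perturbed_runs = [
    [frozenset({"p1", "p2", "p3", "p4"}), frozenset({"p5", "p6"}), frozenset({"p7"})],
    [frozenset({"p1", "p2", "p3"}), frozenset({"p4", "p5"}), frozenset({"p6", "p7"})],
    [frozenset({"p1", "p2", "p3", "p4"}), frozenset({"p5", "p7"}), frozenset({"p6"})],
]

stable = natural_communities(reference, perturbed_runs)
print(stable)
```

In this toy input the first reference cluster closely matches a cluster in every perturbed run and is kept, while the second is split differently each time and is dropped. Repeating the filter on snapshots taken at different dates would give the sets of stable clusters to compare over time.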
