Journal: Procedia Computer Science

Community Detection On Citation Network Of DBLP Data Sample Set Using LinkRank Algorithm



Abstract

This paper describes the application of a community detection algorithm, the LinkRank algorithm, to a citation network. Community detection is a network analysis task that aims to find sets of tightly connected nodes that are only loosely connected to nodes outside those sets. Our study focuses on a citation network, which depicts the relationships between cited papers and the papers that cite them. The objectives of the study are to identify communities of papers based on citation relationships and to analyze the similarity of topics within each community. To reach these objectives, we applied the LinkRank algorithm to a citation network. We chose LinkRank because it can be applied to a directed network, whereas the other algorithms we surveyed can be used only on undirected networks. The citation network used in our study comes from the Aminer website. To apply the algorithm, we ported the original source code, written in C, to Python for convenience in running the experiment. The results show that the algorithm detected 10,442 communities among 188,514 nodes. Once the communities had been detected, we sampled the top three communities (those with the largest number of members) and took the 10 nodes with the highest PageRank scores in each of them. The samples show that most of the nodes share a similar topic, but some nodes with different topics are still mixed into the same community. We found the ratio between nodes with similar topics and nodes with different topics to be 7 to 3; that is, 70% of the sampled nodes have a similar topic and the other 30% have different topics. Thus, the homophily of each community does not reach 100%. Nevertheless, our study confirms that the LinkRank algorithm can be used for community detection on a directed network.
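The sampling step described in the abstract (take the largest communities, then the highest-PageRank nodes in each) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names `pagerank` and `sample_communities`, the damping factor of 0.85, the iteration count, and the toy edge list and community labels are all assumptions made for the example. The community labels are supplied by hand here; in the study they would come from the LinkRank algorithm.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed edge list (citing -> cited).

    The damping factor and iteration count are illustrative assumptions.
    """
    nodes = sorted({node for edge in edges for node in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by papers that cite nothing is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def sample_communities(membership, rank, top_communities=3, top_nodes=10):
    """Pick the largest communities, then the highest-ranked nodes in each."""
    groups = defaultdict(list)
    for node, community in membership.items():
        groups[community].append(node)
    largest = sorted(groups, key=lambda c: len(groups[c]), reverse=True)[:top_communities]
    return {
        c: sorted(groups[c], key=lambda v: rank[v], reverse=True)[:top_nodes]
        for c in largest
    }


# Hypothetical toy citation network: each pair is (citing paper, cited paper).
edges = [("p2", "p1"), ("p3", "p1"), ("p3", "p2"), ("p7", "p1"), ("p5", "p4")]
# Hypothetical community labels standing in for LinkRank output.
membership = {"p1": "A", "p2": "A", "p3": "A", "p7": "A", "p4": "B", "p5": "B"}

rank = pagerank(edges)
samples = sample_communities(membership, rank, top_communities=2, top_nodes=2)
for community, papers in samples.items():
    print(community, papers)
```

With the defaults `top_communities=3` and `top_nodes=10`, the same call would produce the 30-node sample whose topics the abstract compares.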
