Journal: Big Data, IEEE Transactions on

An Empirical Comparison of Algorithms to Find Communities in Directed Graphs and Their Application in Web Data Analytics

Abstract

Detecting communities in graphs is a fundamental tool for understanding the structure of Web-based systems and predicting their evolution. Many community detection algorithms are designed to process undirected graphs (i.e., graphs with bidirectional edges), but many graphs on the Web (e.g., microblogging Web sites, trust networks, or the Web graph itself) are often directed. Only a few community detection algorithms deal with directed graphs, and an experimental comparison of them is lacking. In this paper we evaluate several community detection algorithms in terms of accuracy and scalability. A first group of algorithms (Label Propagation and Infomap) is explicitly designed to handle directed graphs, while a second group (e.g., WalkTrap) simply ignores edge directionality; finally, a third group (e.g., Eigenvector) maps input graphs onto undirected ones and extracts communities from the symmetrized version of the input graph. We ran our tests on both artificial and real graphs. On artificial graphs, WalkTrap achieved the highest accuracy, closely followed by other algorithms; Label Propagation showed outstanding scalability on both artificial and real graphs. The Infomap algorithm offered the best trade-off between accuracy and computational performance and should therefore be considered a promising tool for Web Data Analytics.
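
The abstract does not say how the second and third groups of algorithms convert a directed graph, so the following is only a minimal sketch of two common choices, not the authors' procedure. It assumes an unweighted edge list and, for symmetrization, the rule W = A + A^T (off-diagonal entries); the function names `ignore_direction` and `symmetrize` are hypothetical.

```python
from collections import defaultdict


def _undirected_key(u, v):
    # Canonical ordering so that (u, v) and (v, u) map to the same pair.
    return (u, v) if u <= v else (v, u)


def ignore_direction(directed_edges):
    """Drop edge orientation: each directed edge becomes one undirected edge.

    Reciprocal edges (u -> v and v -> u) collapse into a single edge.
    """
    return {_undirected_key(u, v) for u, v in directed_edges}


def symmetrize(directed_edges):
    """Assumed symmetrization W = A + A^T for an unweighted directed graph.

    Each directed edge adds 1 to the weight of its undirected pair, so a
    reciprocal pair ends up with weight 2. Self-loops are counted once here.
    """
    weights = defaultdict(int)
    for u, v in directed_edges:
        weights[_undirected_key(u, v)] += 1
    return dict(weights)


# Toy directed graph: a <-> b is reciprocal, the other edges are one-way.
edges = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")]
unweighted = ignore_direction(edges)
weighted = symmetrize(edges)
```

The difference shows up on reciprocal edges: ignoring direction treats a <-> b like any other link, while the assumed symmetrization gives it twice the weight of a one-way link, which an undirected community detection algorithm that uses edge weights can then take into account.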
