Journal: Systems and Computers in Japan

Analysis and Improvement of HITS Algorithm for Detecting WEB Communities

Abstract

This paper discusses Kleinberg's HITS (hyperlink-induced topic search) algorithm, which extracts Web communities by analyzing the hyperlink structure inherent in the Web. The problems of the algorithm are analyzed and an improvement is proposed. For this purpose, a tool (Link Viewer) that visualizes the operation of the HITS algorithm was developed. The analysis revealed the following problem: when the base set contains a page that is entirely unrelated to the original topic and has a dense link structure, the Web community (authorities and hubs) matching the original topic cannot be extracted (the topic drift problem). Restricting themselves to link analysis, the authors propose the following modifications: (1) a technique that takes the projection onto the root subspace into account in the eigenvalue calculation; (2) a technique that performs the iterative calculation using only those pages of the base set that have link relations to multiple pages in the root set. A technique combining (1) and (2) is also considered. As a result, the topic drift problem is avoided for any topic at a relatively small computational cost, and the HITS algorithm is improved using link information.
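To make the topic drift problem and modification (2) concrete, the following is a minimal sketch in Python. It is not the authors' implementation. The function names `hits` and `filter_base_set`, the threshold `min_root_links=2` (one reading of "multiple pages"), and the toy graph are all assumptions made for illustration. Modification (1) is not sketched because the abstract does not give its details.

```python
from math import sqrt


def _normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {p: v / norm for p, v in scores.items()}


def hits(pages, links, iterations=50):
    """Basic HITS iteration. `links` is a set of (source, target) pairs;
    links touching pages outside `pages` are ignored."""
    pages = set(pages)
    edges = [(s, d) for s, d in links if s in pages and d in pages]
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking to the page.
        new_auth = {p: 0.0 for p in pages}
        for s, d in edges:
            new_auth[d] += hub[s]
        # Hub score: sum of authority scores of the pages it links to.
        new_hub = {p: 0.0 for p in pages}
        for s, d in edges:
            new_hub[s] += new_auth[d]
        auth, hub = _normalize(new_auth), _normalize(new_hub)
    return auth, hub


def filter_base_set(root, base, links, min_root_links=2):
    """Hypothetical reading of modification (2): keep the root set plus
    those base pages linked (in either direction) with at least
    `min_root_links` distinct root pages."""
    root = set(root)
    linked_roots = {p: set() for p in base if p not in root}
    for s, d in links:
        if s in linked_roots and d in root:
            linked_roots[s].add(d)
        if d in linked_roots and s in root:
            linked_roots[d].add(s)
    kept = {p for p, r in linked_roots.items() if len(r) >= min_root_links}
    return root | kept


def top_authorities(auth, k=3):
    """Return the k pages with the highest authority scores."""
    return set(sorted(auth, key=auth.get, reverse=True)[:k])


# Toy graph (invented): r* are on-topic root pages, h* on-topic hubs,
# x*/y* form a dense off-topic cluster attached to one root page each.
root = {"r1", "r2", "r3"}
links = set()
for h in ("h1", "h2"):
    links |= {(h, r) for r in root}
for y in ("y1", "y2", "y3", "y4", "y5"):
    links.add(("r1", y))
    links |= {(y, x) for x in ("x1", "x2", "x3")}
links |= {("r2", x) for x in ("x1", "x2", "x3")}
base = {p for edge in links for p in edge}

drifted = top_authorities(hits(base, links)[0])
kept = filter_base_set(root, base, links)
focused = top_authorities(hits(kept, links)[0])
```

In this toy graph the dense off-topic cluster dominates the unfiltered scores, so `drifted` contains only `x` pages. After filtering, the off-topic pages are dropped because each is linked with a single root page, and `focused` contains the root pages.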
