
Ranking the Web Frontier

Abstract

The celebrated PageRank algorithm has proved to be a very effective paradigm for ranking results of web search algorithms. In this paper, we refine this basic paradigm to take into account several evolving prominent features of the web, and propose several algorithmic innovations. First, we analyze features of the rapidly growing “frontier” of the web, namely the part of the web that crawlers are unable to cover for one reason or another. We analyze the effect of these pages and find it to be significant. We suggest ways to improve the quality of ranking by modeling the growing presence of “link rot” on the web as more sites and pages fall out of maintenance. Finally, we suggest new methods of ranking that are motivated by the hierarchical structure of the web, are more efficient than PageRank, and may be more resistant to direct manipulation.
