Journal: Computer Networks

Finding related pages in the World Wide Web



Abstract

When using traditional search engines, users have to formulate queries to describe their information need. This paper discusses a different approach to Web searching where the input to the search process is not a set of query terms but the URL of a page, and the output is a set of related Web pages. A related Web page is one that addresses the same topic as the original page. For example, www.washingtonpost.com is a page related to www.nytimes.com, since both are online newspapers. We describe two algorithms to identify related Web pages. These algorithms use only the connectivity information in the Web (i.e., the links between pages) and not the content of pages or usage information. We have implemented both algorithms and measured their runtime performance. To evaluate the effectiveness of our algorithms, we performed a user study comparing our algorithms with Netscape's "What's Related" service (http://home.netscape.com/escapes/related/). Our study showed that the precision at 10 of our two algorithms is 73% better and 51% better than that of Netscape, despite the fact that Netscape uses both content and usage pattern information in addition to connectivity information.
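
The abstract does not spell out how the two algorithms work, so the following is only a rough illustration of what "connectivity information only" can mean, not the authors' implementation. It sketches a simple co-citation count: pages that are linked from the same parent pages as the input URL are ranked by how many parents they share with it. The toy link graph, the `hub*.example` and `cooking.example` names, and both function names are hypothetical. A small helper also shows one common way to compute precision at k (here dividing by k), the metric the user study reports at k = 10.

```python
from collections import Counter


def related_by_cocitation(links, url, top_k=10):
    """Rank pages by how many parent pages link to both them and `url`.

    `links` is a hypothetical toy graph: a dict mapping each page to the
    list of pages it links to. No page content or usage data is consulted.
    """
    counts = Counter()
    for parent, children in links.items():
        targets = set(children)
        if url in targets:
            # Every other page on this parent is co-cited with `url`.
            for sibling in targets:
                if sibling != url:
                    counts[sibling] += 1
    # Highest shared-parent count first; ties broken by URL for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_k]]


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k slots filled by pages judged relevant."""
    hits = sum(1 for page in ranked[:k] if page in relevant)
    return hits / k


# Hypothetical link graph, for illustration only.
toy_links = {
    "hub1.example": ["www.nytimes.com", "www.washingtonpost.com", "www.latimes.com"],
    "hub2.example": ["www.nytimes.com", "www.washingtonpost.com"],
    "hub3.example": ["www.nytimes.com", "cooking.example"],
    "hub4.example": ["cooking.example", "www.latimes.com"],
}

result = related_by_cocitation(toy_links, "www.nytimes.com")
judged_relevant = {"www.washingtonpost.com", "www.latimes.com"}
score = precision_at_k(result, judged_relevant, k=2)
```

In this toy graph, www.washingtonpost.com shares two parents with www.nytimes.com and therefore ranks first, which matches the newspaper example in the abstract. In a user study, the set of relevant pages would come from human judgments rather than a hard-coded set.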
