A Case Study in Web Search using TREC Algorithms

Abstract

Web search engines rank potentially relevant pages/sites for a user query. Ranking documents for user queries has also been at the heart of the Text REtrieval Conference (TREC), under the label ad-hoc retrieval. The TREC community has developed document ranking algorithms that are known to be the best for searching the document collections used in TREC, which consist mainly of newswire text. The web search community, however, has developed its own methods for ranking web pages/sites, many of which use the link structure of the web and differ considerably from the algorithms developed at TREC. This study evaluates the performance of a state-of-the-art keyword-based document ranking algorithm (coming out of TREC) on a popular web search task: finding the web page/site of an entity, such as a company, university, organization, or individual. This form of querying is quite prevalent on the web. The results from the TREC algorithms are compared with those of four commercial web search engines. The comparison shows that, for finding the web page/site of an entity, commercial web search engines are notably better than a state-of-the-art TREC algorithm. These results are in sharp contrast to results from several previous studies.