Venue: ACM Conference on Information and Knowledge Management

Exploiting Site-Level Information to Improve Web Search

Abstract

Ranking Web search results has long evolved beyond simple bag-of-words retrieval models. Modern search engines routinely employ machine learning ranking that relies on exogenous relevance signals. Yet the majority of current methods still evaluate each Web page out of context. In this work, we introduce a novel source of relevance information for Web search by evaluating each page in the context of its host Web site. For this purpose, we devise two strategies for compactly representing entire Web sites. We formalize our approach by building two indices, a traditional page index and a new site index, where each "document" represents an entire Web site. At runtime, a query is first executed against both indices, and then the final page score for a given query is produced by combining the scores of the page and its site. Experiments carried out on a large-scale Web search test collection from a major commercial search engine confirm that the proposed approach leads to consistent and significant improvements in retrieval effectiveness.
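
The two-index idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy term-frequency scorer, the choice to represent a site as the concatenation of its pages grouped by hostname, the linear interpolation used to combine scores, and the names `build_indices`, `rank`, and `lam`. The abstract does not say which site representations or which combination function the authors use.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple stand-in)."""
    return text.lower().split()


def score(query, doc_tokens):
    """Toy relevance score: share of document tokens that match query terms."""
    if not doc_tokens:
        return 0.0
    counts = Counter(doc_tokens)
    return sum(counts[term] for term in tokenize(query)) / len(doc_tokens)


def build_indices(pages):
    """Build a page index and a site index from a dict of URL -> page text."""
    page_index = {url: tokenize(text) for url, text in pages.items()}
    site_index = defaultdict(list)
    for url, tokens in page_index.items():
        # Assumption: a site is identified by its hostname, and its
        # "document" is the concatenation of the tokens of all its pages.
        site_index[urlparse(url).netloc].extend(tokens)
    return page_index, dict(site_index)


def rank(query, page_index, site_index, lam=0.7):
    """Score every page against both indices and combine the two scores.

    Assumption: linear interpolation with weight `lam` on the page score.
    """
    results = []
    for url, tokens in page_index.items():
        page_score = score(query, tokens)
        site_score = score(query, site_index[urlparse(url).netloc])
        results.append((url, lam * page_score + (1 - lam) * site_score))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

With `lam=1.0` the site index is ignored and the ranking reduces to page-only scoring, so the same function can serve as a baseline for comparison.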