Rankboost-Based Result Merging

Abstract

The explosion of searchable text content, especially on the web, has left information distributed among many disjoint text information sources, the setting addressed by Federated Search. How to merge the results returned by the selected sources is a major problem in the Federated Search task. We study the problem of learning to rank a set of objects by combining various ranking sources. Result merging arises in several domains, for example when combining the results of different verticals and in metasearch applications. This paper presents a supervised learning solution to the result merging problem. Our approach combines multiple sources of evidence to inform the merging decision. We use the Rankboost method, a boosting approach to machine learning, to learn a function that merges results based on information readily available in the results pages: the ranks, titles, summaries, URLs and click-through data. We combine this evidence by treating result merging as a multiclass machine learning problem. By not downloading additional information such as the full document, we reduce processing cost in terms of bandwidth usage and latency. We compare our approach against existing result merging methods that rely only on evidence found in ranked lists: Semi-Supervised Learning (SSL), Sample-Agglomerate Fitting Estimate (SAFE) and CORI. An extensive set of experiments demonstrates that our method is more effective than the baseline result-merging algorithms under a variety of conditions.
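To make the boosting idea concrete, the following is a minimal sketch of pairwise RankBoost with threshold weak rankers, written from the general textbook formulation and not taken from the paper. The feature layout (reciprocal source rank, title/query overlap), the toy data, and the names `train_rankboost`, `score` and `merge` are all assumptions made for illustration; the abstract does not say how the authors encode their evidence or configure the learner.

```python
import math
from typing import List, Sequence, Tuple

# A weak ranker is stored as (feature index, threshold, weight alpha).
WeakRanker = Tuple[int, float, float]


def _h(x: Sequence[float], feature: int, threshold: float) -> float:
    """Binary weak ranker: 1 if the feature exceeds the threshold, else 0."""
    return 1.0 if x[feature] > threshold else 0.0


def train_rankboost(
    X: List[Sequence[float]],
    pairs: List[Tuple[int, int]],
    rounds: int = 10,
) -> List[WeakRanker]:
    """Train on preference pairs (better, worse) of row indices into X."""
    D = [1.0 / len(pairs)] * len(pairs)  # uniform weights over pairs
    model: List[WeakRanker] = []
    for _ in range(rounds):
        best = None  # (abs(r), r, feature, threshold)
        for f in range(len(X[0])):
            for theta in sorted({x[f] for x in X}):
                # r measures how well this weak ranker agrees with the pairs.
                r = sum(
                    d * (_h(X[b], f, theta) - _h(X[w], f, theta))
                    for d, (b, w) in zip(D, pairs)
                )
                if best is None or abs(r) > best[0]:
                    best = (abs(r), r, f, theta)
        _, r, f, theta = best
        if abs(r) < 1e-12:
            break  # no weak ranker separates the weighted pairs
        r = max(min(r, 1 - 1e-9), -1 + 1e-9)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 + r) / (1 - r))
        model.append((f, theta, alpha))
        # Up-weight pairs that the chosen weak ranker orders wrongly.
        D = [
            d * math.exp(alpha * (_h(X[w], f, theta) - _h(X[b], f, theta)))
            for d, (b, w) in zip(D, pairs)
        ]
        z = sum(D)
        D = [d / z for d in D]
    return model


def score(model: List[WeakRanker], x: Sequence[float]) -> float:
    """Final ranking function: weighted sum of the weak rankers."""
    return sum(alpha * _h(x, f, theta) for f, theta, alpha in model)


def merge(model: List[WeakRanker], results: List[Tuple[str, Sequence[float]]]) -> List[str]:
    """Merge (doc_id, features) results from several sources, best first."""
    return [doc for doc, _ in sorted(results, key=lambda item: -score(model, item[1]))]


# Toy data, invented for illustration: [reciprocal source rank, title/query overlap].
X = [[1.0, 0.9], [0.5, 0.8], [1.0, 0.1], [0.5, 0.2]]
pairs = [(0, 2), (0, 3), (1, 2), (1, 3)]  # rows 0 and 1 are the relevant ones
model = train_rankboost(X, pairs, rounds=3)
merged = merge(model, [("a", [1.0, 0.05]), ("b", [0.3, 0.95])])
```

In this toy setup the source rank alone cannot separate relevant from non-relevant results, so the learner leans on the title-overlap feature. Because every feature comes from the results page, no document needs to be downloaded at merge time, which matches the cost argument made in the abstract.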
