Venue: International ACM SIGIR Conference on Research and Development in Information Retrieval
Mixture Model with Multiple Centralized Retrieval Algorithms for Result Merging in Federated Search

Abstract

Result merging is an important research problem in federated search: documents retrieved from the ranked lists of multiple selected information sources must be merged into a single list. State-of-the-art result merging algorithms such as Semi-Supervised Learning (SSL) and Sample-Agglomerate Fitting Estimate (SAFE) map document scores from different sources to comparable scores according to a single centralized retrieval algorithm, and rank the documents by those scores. Both SSL and SAFE arbitrarily select a single centralized retrieval algorithm for generating comparable document scores, which is problematic in a heterogeneous federated search environment, since a single centralized algorithm is often suboptimal across different information sources. Based on this observation, this paper proposes a novel approach to result merging that utilizes multiple centralized retrieval algorithms. One simple approach is to learn a single set of combination weights over the multiple centralized retrieval algorithms (e.g., via logistic regression) to compute comparable document scores. The paper shows that this simple approach generates suboptimal results, as it is not flexible enough to deal with heterogeneous information sources. A mixture probabilistic model is thus proposed that uses some training data to learn combination weights tailored to different types of information sources. An extensive set of experiments on three datasets demonstrates the effectiveness of the proposed approach.
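The core idea described in the abstract — computing a comparable score for each document as a source-type-dependent weighted combination of scores from several centralized retrieval algorithms — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the source types, the three hypothetical algorithms, and the hard-coded weight values are assumptions for demonstration; in the paper the weights are learned from training data.

```python
# Sketch: merging ranked lists using per-source-type mixture weights over
# the scores of multiple centralized retrieval algorithms.
import math

def softmax(xs):
    """Normalize raw weights into a probability-like mixture."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical learned parameters: one raw weight per centralized
# retrieval algorithm (e.g. BM25, language model, TF-IDF), separately
# for each type of information source.
RAW_WEIGHTS = {
    "news":    [2.0, 0.5, 0.1],   # news sources favor algorithm 1
    "medical": [0.3, 1.8, 0.4],   # medical sources favor algorithm 2
}

def merged_score(source_type, algo_scores):
    """Comparable score for one document: mixture-weighted combination
    of the scores assigned by each centralized algorithm."""
    w = softmax(RAW_WEIGHTS[source_type])
    return sum(wi * si for wi, si in zip(w, algo_scores))

def merge(results):
    """results: list of (doc_id, source_type, [score per algorithm]).
    Returns one list of (doc_id, score) ranked by comparable score."""
    scored = [(doc, merged_score(st, s)) for doc, st, s in results]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

The single-weight-vector baseline the paper criticizes corresponds to using one shared entry in `RAW_WEIGHTS` for all sources; the mixture model's flexibility comes from letting each source type weight the centralized algorithms differently.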
