
Multilingual PRF: English Lends a Helping Hand


Abstract

In this paper, we present a novel approach to Pseudo-Relevance Feedback (PRF) called Multilingual PRF (MultiPRF). The key idea is to harness multilinguality. Given a query in one language, we use a second language to ameliorate two well-known problems of PRF: (a) the expansion terms from PRF are primarily based on co-occurrence relationships with query terms, so other lexically and semantically related terms, such as morphological variants and synonyms, are not explicitly captured; and (b) PRF is quite sensitive to the quality of the initially retrieved top k documents and is thus not robust. In MultiPRF, a query in language L is translated into language L2, PRF is performed on a collection in L2, and the resulting feedback model is translated from L2 back into L. The final feedback model is obtained by combining the translated model with the original feedback model of the query in L. Experiments were performed on standard CLEF collections in languages with widely differing characteristics, viz. French, German, Finnish and Hungarian, with English as the assisting language. We observe that MultiPRF outperforms PRF and is more robust, with consistent and significant improvements across these languages. A thorough analysis of the results reveals that the second language helps in obtaining both co-occurrence-based conceptual terms and lexically and semantically related terms. Additionally, the use of the second-language collection reduces sensitivity to the performance of the initial retrieval, thereby making the method more robust.
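The last two steps of the pipeline described in the abstract (translating the L2 feedback model back into L, then combining it with the original feedback model) can be illustrated with a minimal sketch. The abstract does not specify how the models are translated or combined, so everything here is an assumption made for illustration: feedback models are treated as unigram term distributions, translation uses a hypothetical probabilistic dictionary `translation_probs`, the combination is a simple linear interpolation with a hypothetical weight `beta`, and the toy French/English terms and probabilities are invented.

```python
from collections import defaultdict


def translate_model(model, translation_probs):
    """Map a term distribution in L2 into L.

    Assumption: translation_probs[src][tgt] is p(tgt | src) from a
    hypothetical bilingual dictionary; the result is renormalised.
    """
    translated = defaultdict(float)
    for src_term, p_src in model.items():
        for tgt_term, p_tgt in translation_probs.get(src_term, {}).items():
            translated[tgt_term] += p_src * p_tgt
    total = sum(translated.values())
    if total == 0:
        return {}
    return {term: p / total for term, p in translated.items()}


def combine_models(original, translated, beta=0.5):
    """Linearly interpolate two term distributions.

    Assumption: beta is an illustrative weight, not a value from the paper.
    """
    terms = set(original) | set(translated)
    return {
        term: (1 - beta) * original.get(term, 0.0)
        + beta * translated.get(term, 0.0)
        for term in terms
    }


# Toy data (invented): feedback model from PRF on the L collection (French).
feedback_l = {"voiture": 0.6, "moteur": 0.4}

# Toy data (invented): feedback model from PRF on the L2 collection (English).
feedback_l2 = {"car": 0.5, "automobile": 0.3, "engine": 0.2}

# Hypothetical English-to-French translation probabilities.
translation_probs = {
    "car": {"voiture": 0.7, "automobile": 0.3},
    "automobile": {"automobile": 1.0},
    "engine": {"moteur": 1.0},
}

translated = translate_model(feedback_l2, translation_probs)
final_model = combine_models(feedback_l, translated, beta=0.5)

for term, prob in sorted(final_model.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{prob:.3f}")
```

In this toy run the synonym "automobile" enters the final model only through the translated L2 model, which mirrors the abstract's claim that the assisting language can contribute lexically and semantically related terms that co-occurrence in the original collection misses.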
