European Conference on IR Research

Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-Shot Learning

Abstract

While billions of non-English-speaking users rely on search engines every day, ad-hoc information retrieval is rarely studied for non-English languages. This is primarily due to a lack of data sets suitable for training ranking algorithms. In this paper, we tackle the lack of data by leveraging pre-trained multilingual language models to transfer a retrieval system trained on English collections to non-English queries and documents. Our model is evaluated in a zero-shot setting, meaning that we use it to predict relevance scores for query-document pairs in languages never seen during training. Our results show that the proposed approach can significantly outperform unsupervised retrieval techniques for Arabic, Mandarin Chinese, and Spanish. We also show that augmenting the English training collection with some examples from the target language can sometimes improve performance.
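
The zero-shot setting described in the abstract can be illustrated with a toy sketch. This is not the authors' model. The paper relies on a pre-trained multilingual language model, whereas the sketch below substitutes a hand-written lookup table (`SHARED_SPACE`, a hypothetical stand-in) that maps English and Spanish words to shared concept ids. The ranker, its single overlap feature, and the training data are likewise assumptions made for illustration. The point is only the workflow: fit a relevance scorer on English pairs, then score pairs in a language that never appeared during training.

```python
import math

# Hypothetical stand-in for a pre-trained multilingual encoder: words from
# different languages with the same meaning map to the same concept id.
SHARED_SPACE = {
    "cat": 0, "gato": 0,
    "food": 1, "comida": 1,
    "car": 2, "coche": 2,
    "engine": 3, "motor": 3,
}

def encode(text):
    """Map text to a set of language-independent concept ids."""
    return {SHARED_SPACE[t] for t in text.lower().split() if t in SHARED_SPACE}

def feature(query, doc):
    """Fraction of query concepts that also occur in the document."""
    q, d = encode(query), encode(doc)
    return len(q & d) / len(q) if q else 0.0

def score(params, query, doc):
    """Relevance probability from a one-feature logistic model."""
    w, b = params
    return 1.0 / (1.0 + math.exp(-(w * feature(query, doc) + b)))

def train(examples, epochs=500, lr=0.5):
    """Fit the logistic ranker with plain stochastic gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for query, doc, label in examples:
            x = feature(query, doc)
            err = score((w, b), query, doc) - label
            w -= lr * err * x
            b -= lr * err
    return w, b

# Training uses English query-document pairs only (label 1 = relevant).
ENGLISH_TRAIN = [
    ("cat food", "food for your cat", 1),
    ("cat food", "car engine repair", 0),
    ("car engine", "engine of the car", 1),
    ("car engine", "cat food", 0),
]

params = train(ENGLISH_TRAIN)

# Zero-shot evaluation: Spanish pairs were never seen during training.
relevant = score(params, "comida gato", "comida para tu gato")
irrelevant = score(params, "comida gato", "motor del coche")
print(relevant > irrelevant)
```

Because the scorer only ever sees language-independent features, nothing in it is specific to English. That is the property a multilingual encoder is meant to supply in the real setting, where the shared representation is learned rather than written by hand.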
