European Conference on IR Research

Cross-Lingual Document Retrieval Using Regularized Wasserstein Distance

Abstract

Many information retrieval algorithms rely on a good distance measure that allows objects of different nature to be compared efficiently. Recently, a promising new metric called Word Mover's Distance was proposed to measure the divergence between text passages. In this paper, we demonstrate that this metric can be extended to incorporate term-weighting schemes and, using entropic regularization, to provide more accurate and computationally efficient matching between documents. We evaluate the benefits of both extensions on the task of cross-lingual document retrieval (CLDR). Our experimental results on eight CLDR problems suggest that the proposed methods achieve remarkable improvements in terms of Mean Reciprocal Rank compared to several baselines.
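To make the two ingredients named in the abstract concrete, here is a minimal sketch of an entropy-regularized transport distance between two weighted bags of words, together with Mean Reciprocal Rank. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which solver is used, so the sketch relies on Sinkhorn-style scaling iterations, a standard way to solve entropy-regularized transport problems. The function names, the default `reg` and `n_iters` values, and the toy cost matrices are all hypothetical.

```python
import math


def sinkhorn_distance(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between two word histograms.

    a, b : non-negative weights summing to 1 (for example normalized term
           frequencies or tf-idf weights; the weighting is an assumption).
    cost : cost[i][j] is the distance between word i of the first document
           and word j of the second (for example between word embeddings).
    reg  : regularization strength (hypothetical default).
    """
    n, m = len(a), len(b)
    # Gibbs kernel K = exp(-cost / reg)
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns so that the plan
        # diag(u) K diag(v) has row sums a and column sums b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Total cost of moving mass according to the resulting plan.
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))


def mean_reciprocal_rank(distances, relevant):
    """distances[q][d] is the distance from query q to candidate d;
    relevant[q] is the index of the correct candidate for query q."""
    total = 0.0
    for q, row in enumerate(distances):
        # Rank = 1 + number of candidates strictly closer than the correct one.
        rank = 1 + sum(1 for d in row if d < row[relevant[q]])
        total += 1.0 / rank
    return total / len(distances)


# Toy example: two-word documents with a hand-made cost matrix.
toy_cost = [[0.0, 1.0], [1.0, 0.0]]
d_same = sinkhorn_distance([0.5, 0.5], [0.5, 0.5], toy_cost)
d_diff = sinkhorn_distance([0.5, 0.5], [1.0, 0.0], toy_cost)
```

In this toy setup `d_same` is close to zero because each word can stay in place, while `d_diff` is larger because half of the mass has to move to a different word. In a retrieval setting, the distances from each query document to all candidates would fill one row of `distances` before calling `mean_reciprocal_rank`.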