A unified framework for monolingual and cross-lingual relevance modeling based on probabilistic topic models

Abstract

We explore the potential of probabilistic topic modeling within the relevance modeling framework for both monolingual and cross-lingual ad-hoc retrieval. Multilingual topic models represent documents in a structured and coherent way, regardless of their actual language, by means of language-independent concepts, that is, cross-lingual topics. We show how to integrate this topical knowledge into a unified relevance modeling framework in order to build quality retrieval models in monolingual and cross-lingual contexts. The proposed framework processes all documents uniformly and makes no conceptual distinction between monolingual and cross-lingual modeling. Experiments on the standard CLEF test collections show that fusing topical knowledge with relevance modeling yields monolingual and cross-lingual retrieval models that outperform several strong baselines. We show that topical knowledge from a general Web-generated corpus boosts retrieval scores. Additionally, we show that within this framework cross-lingual relevance models may be estimated by exploiting only a general non-parallel corpus.
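
To make the idea of a topic-based cross-lingual relevance model concrete, the following is a minimal sketch and not the authors' exact formulation. It assumes a multilingual topic model has already produced per-topic word distributions for each language and per-document topic distributions shared across languages, and it plugs them into a generic RM1-style relevance model with a uniform document prior. All names (`word_given_doc`, `relevance_model`) and all numbers in the toy data are hypothetical.

```python
from math import prod


def word_given_doc(word, doc_topics, topic_words):
    """P(w|d) = sum_k P(w|z_k) * P(z_k|d), with unseen words given probability 0."""
    return sum(p_z * topic_words[k].get(word, 0.0)
               for k, p_z in enumerate(doc_topics))


def relevance_model(query, docs_topics, src_topic_words, tgt_topic_words, vocab):
    """RM1-style estimate of P(w|R_Q) over a target-language vocabulary.

    Assumption: uniform document prior. Query terms are scored with the
    source-language topic-word distributions, candidate words with the
    target-language ones; the shared topics bridge the two languages.
    """
    # Query likelihood of each document: prod_i P(q_i|d)
    weights = [prod(word_given_doc(q, theta, src_topic_words) for q in query)
               for theta in docs_topics]
    # Unnormalized P(w|R_Q): sum_d P(w|d) * prod_i P(q_i|d)
    scores = {w: sum(wt * word_given_doc(w, theta, tgt_topic_words)
                     for wt, theta in zip(weights, docs_topics))
              for w in vocab}
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()} if total > 0 else scores


# Toy data (invented for illustration): two cross-lingual topics.
en_topic_words = [{"football": 0.8, "election": 0.2},
                  {"football": 0.1, "election": 0.9}]
nl_topic_words = [{"voetbal": 0.8, "verkiezing": 0.2},
                  {"voetbal": 0.1, "verkiezing": 0.9}]
docs_topics = [[0.9, 0.1],   # document mostly about topic 0
               [0.2, 0.8]]   # document mostly about topic 1

# English query, Dutch relevance model.
rm = relevance_model(["football"], docs_topics,
                     en_topic_words, nl_topic_words,
                     vocab=["voetbal", "verkiezing"])
print(rm)
```

In the monolingual case the same function would be called with identical source and target topic-word distributions, which mirrors the abstract's point that the framework makes no conceptual distinction between the two settings.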
