Asia Information Retrieval Symposium (AIRS 2005); October 13-15, 2005; Jeju Island (KR)

Effective Query Model Estimation Using Parsimonious Translation Model in Language Modeling Approach

Abstract

The KL divergence framework, an extension of the language modeling approach, has a critical problem in estimating the query model, the probabilistic model that encodes the user's information need. For initial retrieval, estimating the query model with a translation model, which relies on term co-occurrence statistics, has been proposed. However, the translation model is difficult to apply because the term co-occurrence statistics must be constructed offline. For large collections in particular, constructing such a large matrix of term co-occurrence statistics prohibitively increases time and space complexity. More seriously, because the translation model includes noisy non-topical terms from documents, reliable retrieval performance cannot be guaranteed. This paper proposes an effective method for constructing co-occurrence statistics and eliminating noisy terms by employing a parsimonious translation model. The parsimonious translation model is a compact version of the translation model that drastically reduces the number of terms with non-zero probabilities by eliminating non-topical terms in documents. Our experiments show that the query model estimated from the parsimonious translation model significantly outperforms not only the baseline language modeling approach but also the non-parsimonious model.
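
The pipeline the abstract describes can be illustrated with a small sketch. The abstract does not give the estimation formulas, so everything below is an assumption made for illustration: document models are made parsimonious with a generic EM procedure that discounts terms the collection model already explains, the translation probability is taken as t(w|q) = sum_d P(w|d) P(d|q), and ranking uses the cross-entropy form that is rank-equivalent to negative KL divergence. The function names and the parameters `lam`, `threshold`, `iters`, and `mu` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def mle(tokens):
    """Maximum-likelihood unigram model of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def parsimonious(tokens, background, lam=0.1, threshold=0.01, iters=20):
    """EM that moves mass away from terms the background model explains well.
    Assumed procedure for illustration, not the paper's exact estimator."""
    counts = Counter(tokens)
    model = mle(tokens)
    for _ in range(iters):
        # E-step: expected count of each term credited to the document model.
        expected = {w: counts[w] * lam * p / (lam * p + (1 - lam) * background[w])
                    for w, p in model.items()}
        total = sum(expected.values())
        # M-step: renormalize, then prune terms that fall below the threshold.
        model = {w: e / total for w, e in expected.items()}
        kept = {w: p for w, p in model.items() if p >= threshold} or model
        norm = sum(kept.values())
        model = {w: p / norm for w, p in kept.items()}
    return model

def translation_model(doc_models):
    """t[q][w] = sum_d P(w|d) P(d|q), with P(d|q) proportional to P(q|d)."""
    q_mass = defaultdict(float)
    for m in doc_models:
        for q, pq in m.items():
            q_mass[q] += pq
    t = defaultdict(lambda: defaultdict(float))
    for m in doc_models:
        for q, pq in m.items():
            for w, pw in m.items():
                t[q][w] += pw * pq / q_mass[q]
    return t

def query_model(query, t):
    """P(w|Q) = sum_q t(w|q) P_ml(q|Q); query terms absent from t keep their mass."""
    qm = defaultdict(float)
    for q, pq in mle(query).items():
        if q in t:
            for w, pw in t[q].items():
                qm[w] += pw * pq
        else:
            qm[q] += pq
    return dict(qm)

def rank(qm, docs, background, mu=0.5):
    """Rank documents by sum_w P(w|Q) log P(w|d), using linear smoothing
    with the collection model (mu is an assumed smoothing weight)."""
    scores = []
    for i, d in enumerate(docs):
        dm = mle(d)
        s = sum(p * math.log((1 - mu) * dm.get(w, 0.0) + mu * background.get(w, 1e-9))
                for w, p in qm.items())
        scores.append((s, i))
    return [i for _, i in sorted(scores, reverse=True)]

# Toy collection (made-up data) to exercise the functions.
docs = [
    "the query model the retrieval".split(),
    "the translation model the term".split(),
    "the weather the rain the".split(),
]
background = mle([w for d in docs for w in d])
doc_models = [parsimonious(d, background) for d in docs]
t = translation_model(doc_models)
qm = query_model(["retrieval"], t)
ranking = rank(qm, docs, background)
```

Because each parsimonious document model keeps only a few terms, the co-occurrence table `t` built from them has far fewer non-zero entries than one built from the full maximum-likelihood models, which is the compactness the abstract refers to. How closely this sketch matches the paper's actual estimator cannot be determined from the abstract alone.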