Journal: Information Processing & Management

Query expansion and dimensionality reduction: Notions of optimality in Rocchio relevance feedback and latent semantic indexing


Abstract

Rocchio relevance feedback and latent semantic indexing (LSI) are well-known extensions of the vector space model for information retrieval (IR). This paper analyzes the statistical relationship between these extensions. The analysis focuses on each method's basis in least-squares optimization. Noting that LSI and Rocchio relevance feedback both alter the vector space model in a way that is in some sense least-squares optimal, we ask: what is the relationship between LSI's and Rocchio's notions of optimality? What does this relationship imply for IR? Using an analytical approach, we argue that Rocchio relevance feedback is optimal if we understand retrieval as a simplified classification problem. On the other hand, LSI's motivation comes to the fore if we understand it as a biased regression technique, where projection onto a low-dimensional orthogonal subspace of the documents reduces model variance.
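
For readers unfamiliar with the two techniques the abstract compares, the following is a minimal sketch of their textbook forms, not the paper's own derivation. It assumes NumPy, a terms-by-documents matrix, and illustrative weights `alpha`, `beta`, `gamma`; the function names `rocchio`, `lsi_project`, and `fold_in` are hypothetical and chosen only for this example.

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Textbook Rocchio update (weights are illustrative assumptions).

    Moves the query toward the centroid of the relevant documents and
    away from the centroid of the non-relevant ones. `relevant` and
    `nonrelevant` are arrays with one document vector per row.
    """
    q = alpha * np.asarray(query, dtype=float)
    if len(relevant):
        q = q + beta * np.asarray(relevant, dtype=float).mean(axis=0)
    if len(nonrelevant):
        q = q - gamma * np.asarray(nonrelevant, dtype=float).mean(axis=0)
    return q

def lsi_project(term_doc, k):
    """Rank-k truncated SVD of a terms x documents matrix.

    Returns (U_k, doc_coords): an orthonormal basis for the k-dimensional
    subspace and the documents' coordinates in it (shape k x n_docs).
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    doc_coords = s[:k, None] * Vt[:k]  # equals U_k.T @ term_doc
    return U_k, doc_coords

def fold_in(query, U_k):
    """Project a term-space query onto the same k-dimensional subspace."""
    return U_k.T @ np.asarray(query, dtype=float)

# Toy collection: columns are documents, rows are terms.
A = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 0, 1]], dtype=float)
q = np.array([1.0, 0.0, 0.0, 0.0])

# Expand the query using documents 0-1 as relevant, 2-3 as non-relevant.
q_expanded = rocchio(q, A[:, :2].T, A[:, 2:].T)

# Reduce dimensionality to k=2 and compare the query with documents there.
U_k, doc_coords = lsi_project(A, k=2)
scores = fold_in(q, U_k) @ doc_coords
```

The two steps are shown independently here: Rocchio changes the query in the original term space, while LSI changes the space in which queries and documents are compared.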
