Automatic query expansion: a structural linguistic perspective


Abstract

A user’s query is considered to be an imprecise description of their information need. Automatic query expansion is the process of reformulating the original query with the goal of improving retrieval effectiveness. Many successful query expansion techniques ignore information about the dependencies that exist between words in natural language. However, more recent approaches have demonstrated that explicitly modeling associations between terms can yield significant improvements in retrieval effectiveness over techniques that ignore these dependencies. State-of-the-art dependency-based approaches have been shown to primarily model syntagmatic associations. A syntagmatic association indicates that two terms co-occur more often than would be expected by chance. However, structural linguistics relies on both syntagmatic and paradigmatic associations to deduce the meaning of a word. Given the success of dependency-based approaches and the reliance on word meanings in the query formulation process, we argue that modeling both syntagmatic and paradigmatic information in the query expansion process will improve retrieval effectiveness. This article develops and evaluates a new query expansion technique based on a formal, corpus-based model of word meaning that captures both syntagmatic and paradigmatic associations. We demonstrate that when sufficient statistical information exists, as in the case of longer queries, including paradigmatic information alone provides significant improvements in retrieval effectiveness across a wide variety of data sets. More generally, when our new query expansion approach is applied to large-scale web retrieval, it demonstrates significant improvements in retrieval effectiveness over a strong baseline system based on a commercial search engine.
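The distinction between the two kinds of association can be illustrated with a small sketch. This is not the model developed in the article; it is a toy illustration under stated assumptions. Syntagmatic association is approximated here by pointwise mutual information (PMI) over sentence-level co-occurrence, and paradigmatic association (words that can substitute for one another in similar contexts) by cosine similarity of co-occurrence vectors. The corpus and all function names (`build_counts`, `pmi`, `context_vector`, `cosine`) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus; each sentence acts as one co-occurrence window.
CORPUS = [
    "drink hot coffee",
    "drink hot tea",
    "drink cold coffee",
    "drink cold tea",
    "drive fast car",
    "drive slow car",
]


def build_counts(corpus):
    """Count, per term and per unordered term pair, the sentences containing it."""
    term_docs = Counter()
    pair_docs = Counter()
    for sentence in corpus:
        terms = sorted(set(sentence.split()))
        term_docs.update(terms)
        # Terms are sorted, so every pair key is stored as (smaller, larger).
        pair_docs.update(combinations(terms, 2))
    return term_docs, pair_docs


def pmi(a, b, term_docs, pair_docs, n):
    """Syntagmatic proxy (assumption): PMI > 0 means a and b co-occur
    more often than independence would predict."""
    joint = pair_docs[tuple(sorted((a, b)))]
    if joint == 0:
        return float("-inf")
    return math.log2(joint * n / (term_docs[a] * term_docs[b]))


def context_vector(term, pair_docs):
    """Co-occurrence profile of a term: counts of the terms it appears with."""
    vec = Counter()
    for (a, b), count in pair_docs.items():
        if a == term:
            vec[b] += count
        elif b == term:
            vec[a] += count
    return vec


def cosine(u, v):
    """Paradigmatic proxy (assumption): similarity of two co-occurrence profiles."""
    dot = sum(u[k] * v[k] for k in u)  # Counter yields 0 for missing keys
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    term_docs, pair_docs = build_counts(CORPUS)
    n = len(CORPUS)
    coffee = context_vector("coffee", pair_docs)
    tea = context_vector("tea", pair_docs)
    print("PMI(drink, coffee):", pmi("drink", "coffee", term_docs, pair_docs, n))
    print("PMI(coffee, tea):", pmi("coffee", "tea", term_docs, pair_docs, n))
    print("cosine(coffee, tea):", cosine(coffee, tea))
```

In this toy corpus, "drink" and "coffee" co-occur and so receive a positive PMI, while "coffee" and "tea" never appear in the same sentence yet share the same contexts, so their co-occurrence profiles are highly similar. A purely co-occurrence-based expansion would miss the second kind of relationship, which is the gap the abstract argues for closing.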
