Methods and apparatus for information indexing and retrieval as well as query expansion using morpho-syntactic analysis

Abstract

An index generator and query expander for use in information retrieval in a corpus. A corpus is provided as an input to an inflectional analyzer, which produces a lemmatized corpus having base forms and associated inflections for each word in the original corpus. The lemmatized corpus is provided as an input to a disambiguator, which performs part of speech tagging and morpho-syntactic disambiguation to produce a disambiguated corpus. The disambiguated corpus is provided as an input to a derivational generator, which produces an expanded corpus having all possible valid derivatives of each word of the disambiguated corpus. The disambiguated corpus is provided as an input to a transformational analyzer, using a grammar and a metagrammar for analyzing syntactic and morphosyntactic variations to conflate and generate variants, producing an index to the corpus having a minimum of variants. Alternatively, a query expander is provided utilizing similar techniques.
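
The indexing flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the lexicons `LEMMAS` and `FAMILY` and the functions `lemmatize`, `conflate`, `build_index`, and `search` are hypothetical names and toy data invented for illustration. The part-of-speech disambiguation step is omitted, and the grammar and metagrammar of the transformational analyzer are replaced by a simple order-insensitive set of derivational family heads.

```python
from collections import defaultdict

# Toy lexicons: every entry is an illustrative assumption, not data from the patent.
# Surface form -> (base form, inflection).
LEMMAS = {
    "cells": ("cell", "plural"),
    "cell": ("cell", "singular"),
    "divides": ("divide", "3sg"),
    "division": ("division", "singular"),
    "divisions": ("division", "plural"),
}
# Base form -> head of its derivational family.
FAMILY = {"divide": "divide", "division": "divide", "cell": "cell"}


def lemmatize(tokens):
    """Inflectional analysis: map each token to (base form, inflection)."""
    return [LEMMAS.get(t.lower(), (t.lower(), "unknown")) for t in tokens]


def conflate(lemmatized):
    """Reduce lemmas to derivational family heads, dropping unknown words."""
    return frozenset(FAMILY[lemma] for lemma, _ in lemmatized if lemma in FAMILY)


def build_index(docs):
    """Index documents by conflated key so that variants share one entry."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        index[conflate(lemmatize(text.split()))].append(doc_id)
    return dict(index)


def search(index, query):
    """Normalize the query the same way as the corpus, then look it up."""
    return index.get(conflate(lemmatize(query.split())), [])


docs = {1: "cell division", 2: "cells divide", 3: "divisions of cells"}
index = build_index(docs)
hits = search(index, "cell divisions")
```

In this sketch the three toy documents collapse to a single index entry, which mirrors the abstract's goal of an index with a minimum of variants. A real system would need the disambiguator to choose between competing analyses before conflation.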
