
Information retrieval utilizing semantic representation of text and based on constrained expansion of query words


Abstract

The present invention is directed to performing information retrieval utilizing semantic representation of text. In a preferred embodiment, a tokenizer generates from an input string information retrieval tokens that characterize the semantic relationship expressed in the input string. The tokenizer first creates from the input string a primary logical form characterizing a semantic relationship between selected words in the input string. The tokenizer then identifies hypernyms that each have an "is-a" relationship with one of the selected words in the input string. The tokenizer then constructs from the primary logical form one or more alternative logical forms. The tokenizer constructs each alternative logical form by, for each of one or more of the selected words in the input string, replacing the selected word in the primary logical form with an identified hypernym of the selected word. Finally, the tokenizer generates tokens representing both the primary logical form and the alternative logical forms. The tokenizer is preferably used to generate tokens for both constructing an index representing target documents and processing a query against that index.
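The token-generation steps described in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the logical form is modeled as a (subject, verb, object) triple, the `HYPERNYMS` table is a hand-written toy dictionary, and the `subject|verb|object` token format and the function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Assumption: a logical form is modeled as (deep_subject, verb, deep_object).
LogicalForm = Tuple[str, str, str]

# Assumption: toy hypernym table mapping a word to words it "is a" kind of.
HYPERNYMS: Dict[str, List[str]] = {
    "man": ["person"],
    "kiss": ["touch"],
    "pig": ["swine", "animal"],
}


def expand(word: str) -> List[str]:
    """Return the word itself followed by its known hypernyms."""
    return [word] + HYPERNYMS.get(word, [])


def logical_forms(primary: LogicalForm) -> List[LogicalForm]:
    """Return the primary form first, then every alternative form obtained
    by replacing one or more of its words with a hypernym."""
    subj, verb, obj = primary
    return list(product(expand(subj), expand(verb), expand(obj)))


def tokens(primary: LogicalForm) -> List[str]:
    """Render each logical form as a single token (hypothetical format)."""
    return ["|".join(form) for form in logical_forms(primary)]


if __name__ == "__main__":
    for token in tokens(("man", "kiss", "pig")):
        print(token)
```

In this sketch the same `tokens` function would be called both on sentences of the target documents, to populate an index, and on the query string, which mirrors the abstract's statement that the tokenizer serves both purposes. How matching tokens are scored or constrained is not described in the abstract and is left out here.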
