Efficient Retrieval Algorithm by Query Term Discrimination

Abstract

An exemplary method for use in information retrieval includes, for each of a plurality of terms, selecting a predetermined number of top scoring documents for the term to form a corresponding document set for the term; receiving a plurality of terms, optionally as a query; ranking the plurality of terms for importance based at least in part on the document sets for the plurality of terms where the ranking comprises using an inverse document frequency algorithm; selecting a number of ranked terms based on importance where each selected, ranked term comprises its corresponding document set wherein each document in a respective document set comprises a document identification number; forming a union set based on the document sets associated with the selected number of ranked terms; and, for a document identification number in the union set, scanning a document set corresponding to an unselected term for a matching document identification number. Various other exemplary systems, methods, devices, etc. are also disclosed.
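The steps named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names (`build_top_sets`, `idf`, `retrieve`), the in-memory dictionary layout, the `doc_freq` and `num_docs` inputs, the plain logarithmic form of inverse document frequency, and the choice to sum per-term scores into a final score are all assumptions made for the example. The abstract itself does not say how scores are combined.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical layout: term -> {document identification number: score}
TermSets = Dict[str, Dict[int, float]]


def build_top_sets(postings: TermSets, top_n: int) -> TermSets:
    """For each term, keep only its top_n highest-scoring documents."""
    top_sets: TermSets = {}
    for term, docs in postings.items():
        # Sort by descending score, breaking ties by document id.
        best = sorted(docs.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
        top_sets[term] = dict(best)
    return top_sets


def idf(term: str, doc_freq: Dict[str, int], num_docs: int) -> float:
    """One common form of inverse document frequency (an assumption here)."""
    return math.log(num_docs / doc_freq[term])


def retrieve(
    query: List[str],
    top_sets: TermSets,
    doc_freq: Dict[str, int],
    num_docs: int,
    num_selected: int,
) -> List[Tuple[int, float]]:
    """Rank query terms by IDF, union the sets of the most important
    terms, then check the remaining terms' sets for matching ids."""
    # Drop duplicate and unknown terms while preserving query order.
    terms = [t for t in dict.fromkeys(query) if t in top_sets]
    ranked = sorted(terms, key=lambda t: (-idf(t, doc_freq, num_docs), t))
    selected, unselected = ranked[:num_selected], ranked[num_selected:]

    # Union of the document sets of the selected (most important) terms.
    candidates = set()
    for term in selected:
        candidates.update(top_sets[term])

    scores: Dict[int, float] = {}
    for doc_id in candidates:
        total = sum(top_sets[t].get(doc_id, 0.0) for t in selected)
        # A dictionary lookup stands in for scanning the unselected
        # term's document set for a matching identification number.
        for term in unselected:
            total += top_sets[term].get(doc_id, 0.0)
        scores[doc_id] = total  # summing is an assumed scoring rule
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch, only documents that appear in the union set are ever scored; a document found solely in an unselected term's set is never considered, which is where the reduction in work would come from.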
