
Document retrieval using internal dictionary-hierarchies to adjust per-subject match results


Abstract

Techniques for managing big data include retrieval using per-subject dictionaries that have multiple levels of sub-classification hierarchy within each subject. Dictionary entries may include subject-determining-power (SDP) scores, which indicate how well the entry term describes the subject of the dictionary containing it. The same term may have entries in multiple dictionaries, with a different SDP score in each. A retrieval request for one or more documents, containing search terms descriptive of those documents, can be processed by identifying a set of candidate documents tagged with subjects (that is, identifiers of per-subject dictionaries having entries corresponding to a search term) and then using affinity values to adjust the aggregate score for the terms in the dictionaries. Documents are then selected for best match to the subject based on the adjusted scores. Alternatively, the adjustment may be performed after the documents are selected, by re-ordering them according to the adjusted scores.
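The retrieval flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not say how SDP scores are aggregated or how affinity values are applied, so the sketch assumes that a subject's aggregate score is the sum of the SDP scores of the matching search terms, and that an affinity value is a per-document weight on each subject tag that multiplies that aggregate. The data, the names `DICTIONARIES`, `DOCUMENT_TAGS`, `aggregate_scores`, and `retrieve`, and the flat (non-hierarchical) dictionaries are all hypothetical simplifications.

```python
# Hypothetical per-subject dictionaries: subject -> {term: SDP score}.
# The sub-classification hierarchy within each subject is omitted here.
DICTIONARIES = {
    "finance": {"bank": 0.9, "interest": 0.6},
    "geography": {"bank": 0.3, "river": 0.8},
}

# Hypothetical document tags: document id -> {subject: affinity value}.
# Treating affinity as a per-tag weight is an assumption of this sketch.
DOCUMENT_TAGS = {
    "doc1": {"finance": 1.0},
    "doc2": {"geography": 0.5},
    "doc3": {"finance": 0.5, "geography": 1.0},
}


def aggregate_scores(search_terms):
    """Sum the SDP scores of the search terms found in each dictionary.

    Subjects whose dictionary matches no search term are left out.
    """
    scores = {}
    for subject, entries in DICTIONARIES.items():
        matched = [entries[term] for term in search_terms if term in entries]
        if matched:
            scores[subject] = sum(matched)
    return scores


def retrieve(search_terms, top_k=2):
    """Rank candidate documents by affinity-adjusted aggregate scores."""
    subject_scores = aggregate_scores(search_terms)
    adjusted = {}
    for doc_id, tags in DOCUMENT_TAGS.items():
        # A document is a candidate only if it is tagged with a matching subject.
        matching = [s for s in tags if s in subject_scores]
        if matching:
            # Assumed adjustment: weight each subject's aggregate by its affinity.
            adjusted[doc_id] = sum(subject_scores[s] * tags[s] for s in matching)
    # Highest adjusted score first; ties broken by document id.
    ranked = sorted(adjusted.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    for doc_id, score in retrieve(["bank", "river"]):
        print(doc_id, round(score, 2))
```

In this sketch the term "bank" has an entry in both dictionaries with a different SDP score in each, mirroring the abstract. The alternative described in the abstract, adjusting after selection, would correspond to selecting documents by unadjusted scores first and then re-sorting that selection by the adjusted scores.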
