
Efficient computation of document similarity

Abstract

Systems, methodologies, media, and other embodiments associated with efficiently computing document similarity are described. One exemplary system embodiment includes logic to produce a gram from a string and logic to identify candidate documents based on identifying matches between query grams and document grams stored in an inverted index that relates grams to documents. The example system may also include logic to selectively partially reconstruct a candidate document from entries in the inverted index and logic to compute an edit distance between a string associated with a query and a string associated with the partially reconstructed candidate document. The example system may also include a signal logic configured to provide a signal corresponding to the edit distance.
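The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it assumes character 3-grams, an inverted index that stores (document id, position) postings, a hypothetical threshold `min_shared` for candidate selection, and a "?" placeholder for any position the index entries do not cover. All function names are invented for illustration.

```python
from collections import defaultdict


def grams(text, q=3):
    """Return (position, gram) pairs for every q-character window of text."""
    return [(i, text[i:i + q]) for i in range(len(text) - q + 1)]


def build_index(docs, q=3):
    """Build an inverted index mapping gram -> list of (doc_id, position)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, gram in grams(text, q):
            index[gram].append((doc_id, pos))
    return index


def candidates(index, query, q=3, min_shared=2):
    """Return ids of documents sharing at least min_shared distinct query grams."""
    counts = defaultdict(int)
    for gram in {g for _, g in grams(query, q)}:
        for doc_id in {d for d, _ in index.get(gram, [])}:
            counts[doc_id] += 1
    return {d for d, c in counts.items() if c >= min_shared}


def reconstruct(index, doc_id, query, q=3):
    """Partially rebuild a document using only the postings of the query's grams.

    Positions not covered by any matching gram are filled with '?'
    (an assumption of this sketch).
    """
    chars = {}
    for gram in {g for _, g in grams(query, q)}:
        for d, pos in index.get(gram, []):
            if d == doc_id:
                for k, ch in enumerate(gram):
                    chars[pos + k] = ch
    if not chars:
        return ""
    lo, hi = min(chars), max(chars)
    return "".join(chars.get(i, "?") for i in range(lo, hi + 1))


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    docs = {"a": "document similarity", "b": "inverted index"}
    index = build_index(docs)
    query = "document similarty"
    for doc_id in sorted(candidates(index, query)):
        partial = reconstruct(index, doc_id, query)
        print(doc_id, repr(partial), edit_distance(query, partial))
```

Because reconstruction reads only the postings of grams that also occur in the query, the original document text is never consulted at query time. The resulting distance is therefore measured against the recovered fragment, not the full document.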
