Efficient computation of document similarity

Abstract

Systems, methodologies, media, and other embodiments associated with efficiently computing document similarity are described. One exemplary system embodiment includes logic to produce a gram from a string and logic to identify candidate documents based on identifying matches between query grams and document grams stored in an inverted index that relates grams to documents. The example system may also include logic to selectively partially reconstruct a candidate document from entries in the inverted index and logic to compute an edit distance between a string associated with a query and a string associated with the partially reconstructed candidate document. The example system may also include a signal logic configured to provide a signal corresponding to the edit distance.
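The pipeline the abstract describes (grams, an inverted index, candidate lookup, partial reconstruction, edit distance) can be illustrated with a minimal sketch. It is not the patented implementation. Everything concrete here is an assumption made for illustration: character 3-grams as the "gram", postings stored as (document id, position) pairs, a `min_matches` threshold for candidate selection, a NUL fill character for unrecovered positions, and all function names.

```python
from collections import defaultdict


def grams(text, n=3):
    """Return (position, gram) pairs for every n-character window of text."""
    return [(i, text[i:i + n]) for i in range(max(len(text) - n + 1, 0))]


def build_index(docs, n=3):
    """Build an inverted index mapping gram -> list of (doc_id, position)."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, g in grams(text, n):
            index[g].append((doc_id, pos))
    return index


def candidates(index, query, n=3, min_matches=2):
    """Return ids of documents sharing at least min_matches distinct query grams."""
    counts = defaultdict(int)
    for g in {g for _, g in grams(query, n)}:
        for doc_id in {d for d, _ in index.get(g, [])}:
            counts[doc_id] += 1
    return {d for d, c in counts.items() if c >= min_matches}


def reconstruct(index, doc_id, query, n=3, fill="\0"):
    """Partially rebuild a document from the postings of the query's grams only.

    Positions not covered by any matching gram are filled with `fill`
    (an assumed placeholder), and text outside the matched span is dropped.
    """
    chars = {}
    for g in {g for _, g in grams(query, n)}:
        for d, pos in index.get(g, []):
            if d == doc_id:
                for k, ch in enumerate(g):
                    chars[pos + k] = ch
    if not chars:
        return ""
    lo, hi = min(chars), max(chars)
    return "".join(chars.get(i, fill) for i in range(lo, hi + 1))


def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    docs = {"d1": "the quick brown fox", "d2": "lorem ipsum dolor"}
    index = build_index(docs)
    query = "quick brown fox"
    for doc_id in sorted(candidates(index, query)):
        fragment = reconstruct(index, doc_id, query)
        print(doc_id, edit_distance(query, fragment))
```

In this sketch the distance is measured against the reconstructed fragment rather than the full document, so it can differ from the distance to the complete text. The returned number plays the role of the abstract's "signal corresponding to the edit distance".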

Bibliographic data

Publication number: US7610281B2
Publication date: 2009-10-27
Original format: PDF
Applicant/Assignee: RIKIN GANDHI; YASUHIRO MATSUDA; MOHAMMAD FAISAL
Application number: US20060606213
Inventors: RIKIN GANDHI; MOHAMMAD FAISAL; YASUHIRO MATSUDA
Filing date: 2006-11-29
Classification: G06F17/30
Country: US
Date indexed: 2022-08-21 19:31:18