Search results ranking using editing distance and document information

Abstract

An architecture for extracting document information from documents returned as search results for a query string, and for computing an edit distance between a data string derived from that document information and the query string. The edit distance is used during result ranking to determine the relevance of a document by detecting near-matches of the whole query or of part of the query. It measures how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information. The architecture includes index-time splitting of compound terms in the URL so that query terms can be discovered more effectively. Additionally, index-time filtering of anchor text is used to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.
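To make the near-match idea concrete, the following is a minimal sketch of a term-level edit distance between a query string and a data string such as a document title. The abstract does not give the cost model or tokenization, so the unit costs for insertion, deletion, and substitution, the whitespace tokenizer, and the names `edit_distance`, `tokenize`, and `proximity_feature` are all assumptions made for illustration, not the method claimed in the patent.

```python
from typing import List, Sequence


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def edit_distance(query_terms: Sequence[str], data_terms: Sequence[str]) -> int:
    """Term-level Levenshtein distance with unit costs (assumed cost model)."""
    # prev[j] holds the distance between the query terms seen so far
    # and the first j data terms.
    prev = list(range(len(data_terms) + 1))
    for i, q in enumerate(query_terms, start=1):
        curr = [i]
        for j, d in enumerate(data_terms, start=1):
            substitution = 0 if q == d else 1
            curr.append(min(
                prev[j] + 1,                 # drop a query term
                curr[j - 1] + 1,             # insert a data term
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def proximity_feature(query: str, data: str) -> float:
    """Map the distance to [0, 1], where 1.0 means an exact term-level match.

    The normalization by the longer term sequence is a hypothetical choice.
    """
    q, d = tokenize(query), tokenize(data)
    longest = max(len(q), len(d))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(q, d) / longest
```

A value like `proximity_feature(query, title)` could serve as one input feature per TAUC field for a ranking model; how the patent actually combines the distance with the other features is not stated in the abstract.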
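The index-time splitting of compound URL terms mentioned in the abstract can be sketched as a dictionary-based segmentation. The abstract does not describe the splitting algorithm, so this example assumes a known vocabulary of terms and picks the segmentation with the fewest words; `split_compound`, `url_terms`, and the sample vocabulary are hypothetical names and data used only for illustration.

```python
import re
from typing import List, Optional, Set


def split_compound(token: str, vocabulary: Set[str]) -> List[str]:
    """Split a token into vocabulary words, preferring the fewest words.

    Returns the token unchanged (as a one-element list) when it cannot be
    fully segmented with the given vocabulary.
    """
    n = len(token)
    # best[i] is the fewest-word segmentation of token[:i], or None.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            piece = token[start:end]
            if prefix is not None and piece in vocabulary:
                candidate = prefix + [piece]
                current = best[end]
                if current is None or len(candidate) < len(current):
                    best[end] = candidate
    result = best[n]
    return result if result else [token]


def url_terms(url: str, vocabulary: Set[str]) -> List[str]:
    """Break a URL on non-alphanumeric characters, then split compounds."""
    tokens = [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]
    terms: List[str] = []
    for token in tokens:
        terms.extend(split_compound(token, vocabulary))
    return terms
```

With an assumed vocabulary of `{"print", "on", "demand", "books"}`, the host part `printondemandbooks` would be indexed as four separate terms, so a query for "print on demand books" can match the URL term by term rather than failing against one opaque compound.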
