
System and method for retrieving and intelligently grouping definitions found in a repository of documents


Abstract

A system and method for retrieving and intelligently grouping definitions with common semantic meaning is disclosed. In response to a user's textual query for the definition of a term or phrase, a set of documents is retrieved from a repository of structured documents. The retrieved documents are labeled with a prediction score based upon predetermined glossary characteristics of the documents. In order to determine whether the retrieved documents are likely to be definitions, features commonly found in definitions are identified. The identified features are classified with numeric values and weighted using a support vector regression algorithm. Definitions that fail to meet a predetermined threshold score are discarded, and those that exceed a predetermined threshold score are labeled and stored in the local database.
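The scoring and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the three features, the weights, the bias, and the threshold below are hypothetical placeholders, and the linear scoring function only has the shape (weights times features plus bias) that a trained linear support vector regression model would produce. The names `extract_features`, `score`, and `filter_definitions` are invented for this illustration.

```python
import re

# Hypothetical values. In the described system the weights would come from
# training a support vector regression model; these are made up for illustration.
WEIGHTS = [0.4, 0.4, 0.2]
BIAS = 0.0
THRESHOLD = 0.7


def extract_features(term, text):
    """Map a candidate text to numeric features (assumed, not from the patent)."""
    lowered = text.lower()
    first_words = " ".join(lowered.split()[:4])
    # Feature 1: the queried term appears near the start of the text.
    term_up_front = 1.0 if term.lower() in first_words else 0.0
    # Feature 2: the text contains a defining verb such as "is" or "means".
    has_defining_verb = 1.0 if re.search(r"\b(is|are|means|refers to)\b", lowered) else 0.0
    # Feature 3: the text is neither a fragment nor a long passage.
    reasonable_length = 1.0 if 5 <= len(text.split()) <= 60 else 0.0
    return [term_up_front, has_defining_verb, reasonable_length]


def score(features):
    """Linear prediction score: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def filter_definitions(term, candidates, threshold=THRESHOLD):
    """Keep candidates whose score exceeds the threshold; discard the rest."""
    kept = []
    for text in candidates:
        s = score(extract_features(term, text))
        if s > threshold:
            kept.append((text, s))
    return kept


if __name__ == "__main__":
    candidates = [
        "A glossary is an alphabetical list of terms with their definitions.",
        "Buy cheap glossary books online today!",
        "Click here to subscribe.",
    ]
    for text, s in filter_definitions("glossary", candidates):
        print(f"{s:.2f}  {text}")
```

In this sketch only the first candidate is kept, since it is the only one in which the term appears early, a defining verb is present, and the length is plausible. A real system would learn the weights from labeled examples rather than fix them by hand.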

Bibliographic data

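  • Publication number: US2007282780A1

  • Publication date: 2007-12-06

  • Original format: PDF

  • Applicant/assignee: JEFFREY REGIER; URI AVISSAR

  • Application number: US20070711227

  • Inventors: JEFFREY REGIER; URI AVISSAR

  • Filing date: 2007-02-27

  • Classification: G06F17/20

  • Country: US

  • Added to database: 2022-08-21 20:10:44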
