METHOD AND SYSTEM FOR DOCUMENT CLASSIFICATION OR SEARCH USING DISCRETE WORDS

Abstract

A method of operating a computerized document search system where information is matched against a database containing documents in response to user queries includes receiving a query identifying a source document that has information content related to the documents within the database. Important words within the source document are detected automatically, where at least one of the important words has been processed using at least two dictionary functions consisting of Derived Words, Acronym, Word Capitalization, and Hyphenation. An importance value is generated for important words in a processed document using a WordRatio and at least one of a selected set of values. A score is generated for a processed document based partly on the importance value of at least one important word in that document. A document list is created for identifying documents that are related to a source document.
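
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not define WordRatio, so the sketch assumes it is a word's relative frequency in a document divided by its relative frequency across all documents, and it omits the additional "selected set of values" entirely. The four dictionary functions are replaced by crude stand-ins, and `ACRONYMS`, `normalize`, `tokenize`, `word_ratios`, `important_words`, and `rank_related` are hypothetical names introduced only for illustration.

```python
import re
from collections import Counter

# Hypothetical acronym dictionary; the abstract names an Acronym function
# but gives no entries.
ACRONYMS = {"ir": "information retrieval"}


def normalize(token: str) -> list[str]:
    """Apply illustrative stand-ins for the four dictionary functions."""
    word = token.lower()               # Word Capitalization: fold case
    word = word.replace("-", "")       # Hyphenation: join hyphenated parts
    if word in ACRONYMS:               # Acronym: expand known acronyms
        return ACRONYMS[word].split()
    for suffix in ("ing", "ed", "s"):  # Derived Words: crude suffix stripping
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    return [word]


def tokenize(text: str) -> list[str]:
    """Split text into alphabetic tokens and normalize each one."""
    words = []
    for token in re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text):
        words.extend(normalize(token))
    return words


def word_ratios(doc_words, corpus_counts, corpus_total):
    """Assumed WordRatio: in-document frequency over corpus-wide frequency."""
    counts = Counter(doc_words)
    total = len(doc_words)
    return {
        w: (c / total) / (corpus_counts[w] / corpus_total)
        for w, c in counts.items()
    }


def important_words(ratios, top_n=5):
    """Keep the top_n words by ratio; ties are broken alphabetically."""
    ranked = sorted(ratios.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


def rank_related(source_text, database, top_n=5):
    """Return (doc_id, score) pairs, most related to the source first."""
    tokenized = {doc_id: tokenize(text) for doc_id, text in database.items()}
    source_words = tokenize(source_text)

    # Corpus statistics include the source document, so every word seen
    # anywhere has a non-zero corpus count.
    corpus_counts = Counter(source_words)
    for words in tokenized.values():
        corpus_counts.update(words)
    corpus_total = sum(corpus_counts.values())

    source_imp = important_words(
        word_ratios(source_words, corpus_counts, corpus_total), top_n
    )

    scores = {}
    for doc_id, words in tokenized.items():
        doc_ratios = word_ratios(words, corpus_counts, corpus_total)
        # Score: sum over the source's important words of the product of
        # the word's importance in the source and in the candidate.
        scores[doc_id] = sum(
            imp * doc_ratios.get(w, 0.0) for w, imp in source_imp.items()
        )
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `rank_related(source_text, database)` with a dictionary that maps document identifiers to text returns the document list ordered by score. A candidate that shares none of the source's important words receives a score of zero.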

Bibliographic data

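  • Publication number: US2013144874A1
  • Publication date: 2013-06-06
  • Original format: PDF
  • Applicant/assignee: NEXTGEN DATACOM INC.
  • Application number: US201213682642
  • Inventors: FRANK R. KOPERDA; TAMARA E. KOPERDA
  • Filing date: 2012-11-20
  • Classification: G06F17/30
  • Country: US
  • Database entry date: 2022-08-21 16:46:59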
