MACHINE-LEARNED APPROACH TO DETERMINING DOCUMENT RELEVANCE FOR SEARCH OVER LARGE ELECTRONIC COLLECTIONS OF DOCUMENTS

Abstract

The present invention relates to a system and methodology that apply automated learning procedures to determine document relevance and assist information retrieval. A system is provided that facilitates a machine-learned approach to determining document relevance. The system includes a storage component that receives a set of human-selected items to be used as positive test cases of highly relevant documents. A training component trains at least one classifier with the human-selected items as positive test cases and one or more other items as negative test cases, yielding a query-independent model; the other items can be selected by a statistical search, for example. The trained classifier can also be used to help an individual identify and select new positive cases, or to filter or re-rank results from a statistical-based search.
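
The training and re-ranking flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name a classifier type, so a small multinomial Naive Bayes model over whitespace tokens is swapped in here, and every function name (`train_relevance_model`, `relevance_score`, `rerank`, `filter_results`) is hypothetical rather than taken from the patent.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def train_relevance_model(positives, negatives):
    """Fit a query-independent two-class model.

    positives: human-selected, highly relevant documents.
    negatives: other documents, e.g. drawn from a statistical search.
    Returns (per-term log-odds weights, log prior odds).
    Naive Bayes is an assumption; the abstract only says "classifier".
    """
    pos_counts = Counter(t for d in positives for t in tokenize(d))
    neg_counts = Counter(t for d in negatives for t in tokenize(d))
    vocab = set(pos_counts) | set(neg_counts)
    # Add-one (Laplace) smoothing so terms seen in one class only stay finite.
    pos_total = sum(pos_counts.values()) + len(vocab)
    neg_total = sum(neg_counts.values()) + len(vocab)
    weights = {
        t: math.log((pos_counts[t] + 1) / pos_total)
        - math.log((neg_counts[t] + 1) / neg_total)
        for t in vocab
    }
    prior = math.log(len(positives) / len(negatives))
    return weights, prior


def relevance_score(model, document):
    """Log-odds that a document is relevant; terms unseen in training add 0."""
    weights, prior = model
    return prior + sum(weights.get(t, 0.0) for t in tokenize(document))


def rerank(model, search_results):
    """Re-order search results by descending model score."""
    return sorted(search_results,
                  key=lambda d: relevance_score(model, d), reverse=True)


def filter_results(model, search_results, threshold=0.0):
    """Keep only results scoring above the (hypothetical) threshold."""
    return [d for d in search_results if relevance_score(model, d) > threshold]


if __name__ == "__main__":
    model = train_relevance_model(
        positives=["official installation guide for the printer driver",
                   "step by step printer driver setup guide"],
        negatives=["forum thread complaining about printer noise",
                   "advertisement for cheap printer ink"],
    )
    results = ["cheap ink advertisement for your printer",
               "printer driver installation guide"]
    print(rerank(model, results))
    print(filter_results(model, results))
```

Because the score depends only on the document text and not on any query, the same model can be applied to the output of any statistical search, which is the sense in which the abstract calls the model query-independent.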
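Bibliographic details

  • Publication number: KR101027864B1
  • Patent type:
  • Publication date: 2011-04-07
  • Original format: PDF
  • Applicant/patentee:
  • Application number: KR20050001860
  • Filing date: 2005-01-07
  • Classification: G06F17/30
  • Country: KR
  • Date indexed: 2022-08-21 17:50:21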