Method of searching for document data files based on keywords, and computer system and computer program thereof

Abstract

Disclosed is a method of searching for document data files based on keywords. The method comprises the steps of: calculating, as a first vector, the score or probability that each document data file is associated with each of the clusters or classes used for clustering or classifying the document data files; calculating, as a second vector and in response to keywords entered in a search, the score or probability that the entered keywords, or keywords related to them, are associated with those clusters or classes; calculating the scalar product of the first vector and the second vector, the resulting value being the score of the document data file with respect to the keywords; and finding the correlation value between the document data files containing the respective classification keyword sets and the document data files whose calculated score is greater than or equal to a prescribed threshold or falls within a prescribed top proportion.
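
The scoring steps of the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the document identifiers, the three-cluster vectors, and the threshold and proportion values are all hypothetical, and the vectors are assumed to have been produced already by some clustering or classification step. The final correlation step is left out because the abstract does not say which correlation measure is used.

```python
import math
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Scalar product of two vectors defined over the same clusters."""
    if len(a) != len(b):
        raise ValueError("vectors must cover the same number of clusters")
    return sum(x * y for x, y in zip(a, b))


def score_documents(
    doc_vectors: Dict[str, Sequence[float]],
    keyword_vector: Sequence[float],
) -> Dict[str, float]:
    """Score each document against the keywords via the scalar product."""
    return {doc_id: dot(vec, keyword_vector) for doc_id, vec in doc_vectors.items()}


def select_by_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep documents whose score is greater than or equal to the threshold."""
    return [doc_id for doc_id, s in scores.items() if s >= threshold]


def select_top_proportion(scores: Dict[str, float], proportion: float) -> List[str]:
    """Keep the highest-scoring fraction of documents (rounded up)."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[: math.ceil(len(ranked) * proportion)]


# Hypothetical first vectors: per-document association with three clusters.
doc_vectors = {
    "doc_a": [0.75, 0.25, 0.0],
    "doc_b": [0.0, 0.5, 0.5],
    "doc_c": [0.25, 0.25, 0.5],
}
# Hypothetical second vector: association of the entered keywords
# (or related keywords) with the same three clusters.
keyword_vector = [0.5, 0.25, 0.25]

scores = score_documents(doc_vectors, keyword_vector)
print(scores)                              # {'doc_a': 0.4375, 'doc_b': 0.25, 'doc_c': 0.3125}
print(select_by_threshold(scores, 0.3))    # ['doc_a', 'doc_c']
print(select_top_proportion(scores, 0.5))  # ['doc_a', 'doc_c']
```

The two selection functions correspond to the two alternatives in the abstract: a fixed score threshold, or membership in a prescribed top share of the ranked documents.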

Bibliographic details

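  • Publication number: GB2488925A
  • Patent type:
  • Publication date: 2012-09-12
  • Original format: PDF
  • Applicant/patentee: INTERNATIONAL BUSINESS MACHINES CORPORATION
  • Application number: GB20120009093
  • Inventor: TAKESHI INAGAKI
  • Filing date: 2010-09-10
  • Classification: G06F17/30
  • Country: GB
  • Database entry time: 2022-08-21 17:03:19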
