
Weighting method for use in information extraction and abstracting, based on the frequency of occurrence of keywords and similarity calculations


Abstract

An information abstracting method and apparatus extract keywords and display them as an information abstract. Given a large volume of character string data divided into prescribed units, the extracted keywords are those that are significant and effective in describing a topic common to a plurality of the units. The apparatus comprises an input section, which accepts character string data divided into prescribed units with each individual character represented by a character code, and an output section, which displays the result of information abstracting. A keyword extracting section extracts the keywords contained in each prescribed unit from the character string data supplied by the input section. A score calculating section calculates a score for each keyword such that a keyword extracted from a larger number of units receives a higher score. An abstracting section selects keywords on the basis of the calculated scores, and the output section outputs them as an information abstract.
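The pipeline described in the abstract (extract keywords per unit, score each keyword by the number of units it appears in, select the top-scoring keywords) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names, the regex tokenizer, the stopword list, and the `top_n` parameter are all assumptions made for illustration. The sketch covers only the unit-count scoring stated in the abstract, not the similarity calculations mentioned in the title.

```python
import re
from collections import Counter

# Hypothetical stopword list; the source does not specify how keywords are filtered.
STOPWORDS = {"the", "and", "for", "with", "along"}


def extract_keywords(unit):
    """Return the set of candidate keywords in one unit.

    Assumption: a keyword is any lowercase alphabetic token of three or
    more letters that is not a stopword.
    """
    tokens = re.findall(r"[a-z]{3,}", unit.lower())
    return {t for t in tokens if t not in STOPWORDS}


def score_keywords(units):
    """Score each keyword by the number of units it was extracted from."""
    scores = Counter()
    for unit in units:
        # extract_keywords returns a set, so each unit contributes at most 1.
        scores.update(extract_keywords(unit))
    return scores


def abstract(units, top_n=3):
    """Select the top_n keywords by score, breaking ties alphabetically."""
    scores = score_keywords(units)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [keyword for keyword, _ in ranked[:top_n]]


if __name__ == "__main__":
    units = [
        "The typhoon hit the coast",
        "Typhoon damage reported along the coast",
        "Stock markets rallied today",
    ]
    print(abstract(units, top_n=2))
```

Because each unit contributes a set of keywords, a word repeated many times inside a single unit still counts once, so the score reflects how widely a keyword is shared across units rather than its raw frequency.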
