KEYWORD EXTRACTING METHOD BASED ON STATISTICAL METHODS

Abstract

The present invention relates to a method for extracting main keywords. The method includes the steps of: extracting words from the documents to be analyzed by using a text mining method; generating a document-word matrix from the extracted words; generating a document-principal component matrix by applying principal component analysis (PCA) to the previously generated document-word matrix; generating a regression model by using the document-principal component matrix; selecting, from the parameters of the generated regression model, the principal component corresponding to a parameter whose significance probability (p-value) is equal to or less than a first threshold value; and selecting one or more words in the selected principal component as the main keywords. Accordingly, the present invention can extract the main keywords objectively, based on a statistical method.
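The steps listed in the abstract can be sketched in Python with NumPy. This is only an illustration under several assumptions that the abstract does not confirm: the regression response `y` is taken to be one numeric value per document (the abstract does not say what the model predicts), the matrix holds raw term counts, words are picked from a selected component by largest absolute loading, and p-values use a large-sample normal approximation rather than an exact t-test. All function names and parameters (`k`, `alpha`, `top`) are hypothetical.

```python
import math
import re

import numpy as np


def document_word_matrix(docs):
    """Tokenize on runs of letters and count term frequencies per document."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: j for j, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for i, toks in enumerate(tokenized):
        for w in toks:
            X[i, index[w]] += 1
    return X, vocab


def principal_components(X, k):
    """PCA via SVD of the column-centered matrix; returns scores and loadings."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]   # document-by-component matrix
    loadings = Vt[:k]           # component-by-word weights
    return scores, loadings


def ols_p_values(Z, y):
    """Fit y ~ intercept + Z by least squares; two-sided p-value per slope.

    Assumption: p-values come from a normal approximation to the t statistic.
    """
    n, k = Z.shape
    A = np.column_stack([np.ones(n), Z])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - k - 1)
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    z = beta / se
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
    return beta[1:], p[1:]      # drop the intercept


def extract_keywords(docs, y, k=2, alpha=0.05, top=3):
    """Keep components with p <= alpha, then take their top-loading words."""
    X, vocab = document_word_matrix(docs)
    scores, loadings = principal_components(X, k)
    _, p = ols_p_values(scores, np.asarray(y, dtype=float))
    keywords = []
    for c in np.flatnonzero(p <= alpha):
        for j in np.argsort(-np.abs(loadings[c]))[:top]:
            if vocab[j] not in keywords:
                keywords.append(vocab[j])
    return keywords, p
```

Here `alpha` plays the role of the first threshold value in the abstract. The number of components `k` should stay well below the number of documents so that the regression has residual degrees of freedom left for estimating the standard errors.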
