Keyword extraction method and apparatus for scientific documents

Abstract

According to the present invention, a method for extracting keywords from a scientific document comprises the following steps: receiving a scientific document and converting it, through morpheme analysis, into a morpheme-analyzed scientific document consisting of words tagged as nouns, adjectives, and verbs; constructing a document graph by using the words of the scientific document as vertices and assigning edges that represent relationships between words; calculating importance scores for the vertices in the document graph; detecting keyword candidates, together with the location information of each detected candidate, from the morpheme-analyzed scientific document; calculating, for each keyword candidate, a score from the scores of the words it contains and from the length of the candidate; reranking the keyword candidates by adjusting the score of each candidate according to its location information; and determining a predetermined number of the highest-ranked candidates among the reranked keyword candidates as the keywords of the scientific document.
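The steps in the abstract can be sketched in code as follows. This is only an illustrative sketch, not the patented implementation: the abstract does not say how edges are assigned, how vertex importance is computed, or how length and location change a score. The sketch assumes a sliding co-occurrence window for edges, a PageRank-style iteration for vertex scores, candidates formed from runs of nouns and adjectives, and multiplicative length and section weights. All names (`build_graph`, `vertex_scores`, `find_candidates`, `extract_keywords`, `SECTION_WEIGHT`) and all numeric constants are hypothetical. The input is assumed to be already morpheme-analyzed, as a list of `(word, part_of_speech, section)` tuples.

```python
from collections import defaultdict

KEEP_POS = {"NOUN", "ADJ", "VERB"}          # word classes kept by the abstract
CANDIDATE_POS = {"NOUN", "ADJ"}             # assumption: candidates are noun/adjective runs
SECTION_WEIGHT = {"title": 2.0, "abstract": 1.5, "body": 1.0}  # hypothetical weights


def build_graph(words, window=2):
    """Undirected graph: words at most `window` positions apart share an edge (assumed rule)."""
    graph = defaultdict(set)
    for i, w in enumerate(words):
        for v in words[i + 1 : i + 1 + window]:
            if v != w:
                graph[w].add(v)
                graph[v].add(w)
    return dict(graph)


def vertex_scores(graph, damping=0.85, iterations=50):
    """PageRank-style importance scores (an assumed choice of scoring rule)."""
    scores = {w: 1.0 for w in graph}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[v] / len(graph[v]) for v in graph[w])
            for w in graph
        }
    return scores


def find_candidates(tagged):
    """Map each candidate phrase (tuple of words) to the set of sections it occurs in."""
    found = defaultdict(set)
    run, run_section = [], None
    # A sentinel token flushes the final run.
    for word, pos, section in list(tagged) + [("", "END", None)]:
        if pos in CANDIDATE_POS and (not run or section == run_section):
            run.append(word)
            run_section = section
            continue
        if run:
            found[tuple(run)].add(run_section)
        # Start a new run if this token is itself a candidate word.
        run, run_section = ([word], section) if pos in CANDIDATE_POS else ([], None)
    return dict(found)


def extract_keywords(tagged, top_k=5, length_bonus=0.1):
    """Return the top_k (phrase, score) pairs after length and location adjustment."""
    words = [w for w, pos, _ in tagged if pos in KEEP_POS]
    scores = vertex_scores(build_graph(words))
    ranked = []
    for phrase, sections in find_candidates(tagged).items():
        word_score = sum(scores.get(w, 0.0) for w in phrase) / len(phrase)
        length_factor = 1.0 + length_bonus * (len(phrase) - 1)  # assumed length rule
        location_factor = max(SECTION_WEIGHT.get(s, 1.0) for s in sections)
        ranked.append((" ".join(phrase), word_score * length_factor * location_factor))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


sample = [
    ("keyword", "NOUN", "title"), ("extraction", "NOUN", "title"),
    ("method", "NOUN", "body"), ("builds", "VERB", "body"),
    ("word", "NOUN", "body"), ("graph", "NOUN", "body"),
    ("ranks", "VERB", "body"), ("keyword", "NOUN", "body"),
    ("extraction", "NOUN", "body"), ("candidates", "NOUN", "body"),
]
print(extract_keywords(sample, top_k=2))
```

Keeping the length and location adjustments as separate multiplicative factors mirrors the abstract's separation between scoring a candidate and then reranking it by location. A real system could combine them differently.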
