
Real-time keyword extraction method and device in text streaming environment


Abstract

The present invention relates to a method and apparatus for extracting keywords in real time using a TextRank algorithm based on micro-batch processing. A real-time keyword extraction apparatus according to an embodiment of the present invention includes: a data receiving unit that receives the word data of a first sentence input in a text streaming environment; a storage unit that computes operation values from the input word data of the first sentence, generates a micro table to hold those operation values, and stores them in the generated micro table; a word weight calculator that calculates the weight of each word in the word data using a TF-IDF (Term Frequency-Inverse Document Frequency) algorithm applied to the values stored in the micro table; a word node graph generating unit that generates a word node graph based on the calculated word weights; an importance value calculator that calculates the importance value of each word in the word data using a PageRank algorithm, based on the word weight and the number of adjacent word nodes connected to it in the word node graph; and a keyword extraction unit that extracts keywords according to the calculated importance values.
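The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the patented implementation: the names `MicroTable`, `tfidf_weights`, `rank_words` and `extract_keywords` are hypothetical, graph edges are assumed to come from word co-occurrence within a small window (the abstract does not say how edges are formed), the IDF term uses a common smoothed variant, and the TF-IDF weights are assumed to enter PageRank as the teleport prior.

```python
import math
from collections import Counter, defaultdict


class MicroTable:
    """Hypothetical micro table: running statistics updated per sentence."""

    def __init__(self):
        self.term_freq = Counter()      # total occurrences of each word
        self.doc_freq = Counter()       # number of sentences containing each word
        self.num_sentences = 0
        self.edges = defaultdict(set)   # undirected co-occurrence adjacency

    def update(self, words, window=2):
        """Fold one sentence (one micro-batch) into the stored values."""
        self.num_sentences += 1
        self.term_freq.update(words)
        self.doc_freq.update(set(words))
        # Assumption: two words are adjacent if they co-occur within `window`.
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                if v != w:
                    self.edges[w].add(v)
                    self.edges[v].add(w)


def tfidf_weights(table):
    """TF-IDF weight per word, computed only from the micro table."""
    total = sum(table.term_freq.values())
    n = table.num_sentences
    return {
        w: (tf / total) * (math.log((1 + n) / (1 + table.doc_freq[w])) + 1.0)
        for w, tf in table.term_freq.items()
    }


def rank_words(table, weights, damping=0.85, iterations=30):
    """PageRank over the word graph, with TF-IDF weights as the teleport prior."""
    nodes = list(table.term_freq)
    if not nodes:
        return {}
    total_weight = sum(weights.values())
    prior = {w: weights[w] / total_weight for w in nodes}
    score = dict(prior)
    for _ in range(iterations):
        updated = {}
        for w in nodes:
            neighbours = table.edges.get(w, ())
            # Each neighbour shares its score equally among its adjacent nodes.
            incoming = sum(score[v] / len(table.edges[v]) for v in neighbours)
            updated[w] = (1 - damping) * prior[w] + damping * incoming
        score = updated
    return score


def extract_keywords(table, k=3):
    """Return the k words with the highest importance values."""
    scores = rank_words(table, tfidf_weights(table))
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


# Usage: feed sentences one at a time, as they would arrive from a stream.
table = MicroTable()
for sentence in (
    ["stream", "keyword", "extraction"],
    ["stream", "graph", "rank"],
    ["stream", "table", "weight"],
):
    table.update(sentence)
keywords = extract_keywords(table, k=3)
print(keywords)
```

Because every statistic lives in the table and is updated incrementally, keywords can be re-extracted after each incoming sentence without rescanning earlier text, which is the point of the micro-batch approach in this sketch.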

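Bibliographic data

  • Publication number: KR102296931B1
  • Patent type:
  • Publication date: 2021-09-01
  • Original format: PDF
  • Applicant/patentee: 경희대학교 산학협력단
  • Application number: KR20190132568
  • Inventors: 허의남; 박재호
  • Filing date: 2019-10-23
  • Classification: G06F40/20; G06F16/35
  • Country: KR
  • Database entry time: 2024-06-14 22:24:31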
