Journal: International Journal of Information Systems in the Service Sector

Mining Keywords from Short Text Based on LDA-Based Hierarchical Semantic Graph Model


Abstract

Extracting keywords from a collection of texts is an important task, yet most previous studies extract keywords from a single text. Existing methods for text collections make little use of three signals: the key topics of the collection, the associations between topics across texts, and the associations between words across texts. To improve the accuracy of keyword extraction from text collections, this article proposes an unsupervised method for mining keywords from short-text collections that exploits the semantic relationships between topics and emphasizes the semantic relationships between words under the key topics. The method uses a two-level semantic association model that links the semantic relations between topics with the semantic relations between words and extracts keywords from their combined effect. First, the texts are represented with LDA. In the upper level, the authors use word2vec to compute the semantic association between topics, build a semantic relation graph over the topics, and apply a graph ranking algorithm to score each topic. In the lower level, the semantic association between words is computed from the topic scores and the topic relationships in the upper-level graph, which yields a word graph. A graph ranking algorithm then ranks the words in the short-text collection to determine the keywords. The experimental results show that the method performs better at extracting keywords from text collections, especially for short articles. They also indicate that the important topics, the relationships between topics, and the correlations between words can improve the accuracy of keyword extraction from a text collection.
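
The two-level idea in the abstract can be illustrated with a small sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration: topic vectors are taken as probability-weighted sums of word vectors, similarity is cosine similarity, the "graph ranking algorithm" is stood in for by weighted PageRank, and the lower-level edge weight scales word similarity by a topic-derived prior. The function names and the toy values in TOPICS and WORD_VECS are hypothetical; in practice the topic-word distributions would come from a trained LDA model and the vectors from a trained word2vec model.

```python
import math
from itertools import combinations

# Hypothetical toy inputs: topic-word probabilities (as LDA might produce)
# and word embeddings (as word2vec might produce).
TOPICS = {
    "hardware": {"battery": 0.5, "charger": 0.3, "screen": 0.2},
    "service": {"delivery": 0.6, "refund": 0.4},
    "display": {"screen": 0.7, "battery": 0.3},
}
WORD_VECS = {
    "battery": [0.9, 0.1, 0.0],
    "charger": [0.8, 0.2, 0.1],
    "screen": [0.6, 0.4, 0.1],
    "delivery": [0.1, 0.9, 0.2],
    "refund": [0.0, 0.8, 0.3],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_graph(nodes, weight):
    """Undirected weighted graph {node: {neighbor: w}}; keeps positive weights only."""
    graph = {v: {} for v in nodes}
    for u, v in combinations(nodes, 2):
        w = weight(u, v)
        if w > 0:
            graph[u][v] = graph[v][u] = w
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration (assumed stand-in for graph ranking)."""
    n = len(graph)
    score = {v: 1.0 / n for v in graph}
    out = {v: sum(nbrs.values()) for v, nbrs in graph.items()}
    for _ in range(iterations):
        # The graph is symmetric, so v's neighbors are exactly its in-links.
        score = {
            v: (1 - damping) / n
            + damping * sum(score[u] * w / out[u] for u, w in graph[v].items())
            for v in graph
        }
    return score


def rank_topics(topics, word_vecs):
    """Upper level: score topics on a topic-topic similarity graph."""
    dim = len(next(iter(word_vecs.values())))
    # Assumption: a topic vector is the probability-weighted sum of its word vectors.
    topic_vecs = {
        t: [sum(p * word_vecs[w][i] for w, p in dist.items()) for i in range(dim)]
        for t, dist in topics.items()
    }
    graph = build_graph(list(topics), lambda s, t: cosine(topic_vecs[s], topic_vecs[t]))
    return pagerank(graph)


def rank_words(topics, word_vecs, topic_scores):
    """Lower level: score words on a word-word graph informed by topic scores."""
    # Assumption: a word's prior sums p(word | topic) weighted by the topic's score.
    prior = {w: 0.0 for w in word_vecs}
    for t, dist in topics.items():
        for w, p in dist.items():
            prior[w] += topic_scores[t] * p
    graph = build_graph(
        list(word_vecs),
        lambda u, v: cosine(word_vecs[u], word_vecs[v]) * (prior[u] + prior[v]) / 2,
    )
    return pagerank(graph)


def extract_keywords(topics, word_vecs, top_k=3):
    """Run both levels and return the top_k highest-scoring words."""
    topic_scores = rank_topics(topics, word_vecs)
    word_scores = rank_words(topics, word_vecs, topic_scores)
    return sorted(word_scores, key=word_scores.get, reverse=True)[:top_k]


print(extract_keywords(TOPICS, WORD_VECS))
```

The point of the sketch is the data flow: topic scores from the upper graph feed into the edge weights of the lower graph, so words tied to highly ranked topics gain weight in the final ranking. The actual weighting scheme in the article may differ.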
