Journal: Web Intelligence and Agent Systems

Combining semantic graph and probabilistic topic models for discovering coherent topics


Abstract

Probabilistic topic models, which typically represent topics as multinomial distributions over words, have been widely used to discover latent topics in text corpora. However, because these models are entirely unsupervised, they may produce topics that are hard to interpret in applications. Several knowledge-based topic models have recently been proposed, but they primarily incorporate word-level domain knowledge to enhance topic coherence and ignore the rich information carried by the entities (e.g., persons, locations, organizations) associated with the documents. In addition, a vast amount of prior (background) knowledge is available as Linked Open Data (LOD) datasets and other ontologies, which can be incorporated into topic models to produce coherent topics. In this paper, we introduce a novel regularization entity-based topic model (RETM), which integrates an ontology with an entity-based topic model (EntLDA) to increase the coherence of the identified topics through the topic modeling process. Our experimental results demonstrate the effectiveness of the proposed model in improving topic coherence.
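
The abstract does not spell out how RETM's regularizer is defined, so the following is only a minimal sketch of the two general ideas it names: a topic as a multinomial distribution over words, and a semantic graph used to pull the topic distributions of related entities closer together. The function names, the toy entities, the graph, and the neighbor-averaging rule with its `weight` parameter are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, List


def top_words(topic: Dict[str, float], k: int = 3) -> List[str]:
    """Return the k most probable words of a topic (a multinomial over words)."""
    return sorted(topic, key=topic.get, reverse=True)[:k]


def smooth_over_graph(
    theta: Dict[str, List[float]],
    graph: Dict[str, List[str]],
    weight: float = 0.5,
) -> Dict[str, List[float]]:
    """One smoothing pass over a semantic graph (hypothetical rule, not RETM's).

    Each entity's topic distribution is mixed with the mean distribution of
    its graph neighbours, then renormalized so it still sums to 1.
    """
    smoothed = {}
    for entity, dist in theta.items():
        neighbours = [n for n in graph.get(entity, []) if n in theta]
        if not neighbours:
            # Entities without neighbours keep their distribution unchanged.
            smoothed[entity] = list(dist)
            continue
        mean = [
            sum(theta[n][t] for n in neighbours) / len(neighbours)
            for t in range(len(dist))
        ]
        mixed = [(1 - weight) * p + weight * m for p, m in zip(dist, mean)]
        total = sum(mixed)
        smoothed[entity] = [p / total for p in mixed]
    return smoothed


# Toy data (assumed for illustration only).
topic = {"graph": 0.4, "ontology": 0.3, "entity": 0.2, "model": 0.1}
theta = {"Athens": [0.9, 0.1], "Greece": [0.2, 0.8], "Obama": [0.1, 0.9]}
graph = {"Athens": ["Greece"], "Greece": ["Athens"]}

best = top_words(topic, k=2)
smoothed = smooth_over_graph(theta, graph, weight=0.5)
```

In this toy setup, the linked entities `Athens` and `Greece` end up with closer topic distributions after one pass, while the unlinked `Obama` is left as it was. A real regularized model would fold such a constraint into inference rather than apply it as a post-processing step.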
