Latent tree models for hierarchical topic detection

Chen Peixian; Zhang Nevinlianwen; Liu Tengfei; Poon Leonard K.M.; Chen Zhourong; Khawar Farhan


Abstract

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables that represent word co-occurrence patterns or co-occurrences of such patterns. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. In comparison with LDA-based methods, a key advantage of the new method is that it represents co-occurrence patterns explicitly using model structures. Extensive empirical results show that the new method significantly outperforms the LDA-based methods in terms of model quality and meaningfulness of topics and topic hierarchies.
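To make the idea of a "soft partition" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single binary latent variable Z with binary word variables as its children, which corresponds to only one latent node of an HLTM rather than the full hierarchy. The function names (`binarize`, `posterior_latent`), the vocabulary, and all probability values are hypothetical and chosen purely for illustration; in the paper the structure and parameters are learned from data.

```python
import math

def binarize(doc_tokens, vocab):
    """Return the presence/absence vector of the vocabulary words in one document."""
    present = set(doc_tokens)
    return [1 if w in present else 0 for w in vocab]

def posterior_latent(x, prior, p_word_given_z):
    """Compute P(Z = 1 | x) for one binary latent variable Z.

    x: list of 0/1 word indicators.
    prior: P(Z = 1).
    p_word_given_z: one (P(word=1 | Z=0), P(word=1 | Z=1)) pair per word.
    The word variables are assumed conditionally independent given Z.
    """
    # Accumulate log P(Z = z, x) for z = 0 and z = 1.
    log_joint = [math.log(1.0 - prior), math.log(prior)]
    for xi, probs in zip(x, p_word_given_z):
        for z, p in enumerate(probs):
            log_joint[z] += math.log(p if xi else 1.0 - p)
    # Normalize in a numerically stable way.
    m = max(log_joint)
    weights = [math.exp(v - m) for v in log_joint]
    return weights[1] / (weights[0] + weights[1])

# Hypothetical vocabulary and parameters (illustrative values, not from the paper).
vocab = ["neural", "network", "stock"]
params = [(0.05, 0.8), (0.1, 0.7), (0.2, 0.2)]
prior = 0.3

doc = ["neural", "network", "training"]
x = binarize(doc, vocab)
print(x, posterior_latent(x, prior, params))
```

Each document receives a probability of belonging to the state Z = 1 instead of a hard label, and the documents with high probability form the cluster that would be read as a topic. In a full HLTM, higher-level latent variables play the same role over lower-level latent variables rather than over words.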

Bibliographic record

Source
Artificial Intelligence | 2017, Issue 9 | pp. 105-124 | 20 pages
Authors
Chen Peixian; Zhang Nevinlianwen; Liu Tengfei; Poon Leonard K.M.; Chen Zhourong; Khawar Farhan
Author affiliations

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong;

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong;

Ant Financial Services Group, Shanghai, China;

Department of Mathematics and Information Technology, The Education University of Hong Kong, Hong Kong;

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong;

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong;

Original format: PDF
Language: English
Keywords
Hierarchical latent tree analysis; Hierarchical topic detection; Probabilistic graphical models; Text analysis;
