In most probabilistic topic models, a document is viewed as a collection of tokens and each token is a variable whose values are all the words in a vocabulary. One exception is hierarchical latent tree models (HLTMs), where a document is viewed as a binary vector over the vocabulary and each word is regarded as a binary variable. The use of word variables allows the detection and representation of patterns of word co-occurrences and co-occurrences of those patterns qualitatively using multiple levels of latent variables, and naturally leads to a method for hierarchical topic detection. In this paper, we assume that an HLTM has been learned from binary data and we extend it to take word frequencies into consideration. The idea is to replace each binary word variable with a real-valued variable that represents the relative frequency of the word in a document. A document generation process is proposed and an algorithm is given for estimating the model parameters by inverting the generation process. Empirical results show that our method significantly outperforms the commonly-used LDA-based methods for hierarchical topic detection, in terms of model quality and meaningfulness of topics and topic hierarchies.
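The core change described above, going from a binary word indicator to the word's relative frequency in a document, can be illustrated with a minimal sketch. This is not code from the paper; the function name and vocabulary handling are assumptions made for illustration only.

```python
from collections import Counter

def relative_frequencies(tokens, vocabulary):
    """Map a tokenized document to per-word relative frequencies,
    replacing the binary present/absent indicator with a real value
    in [0, 1] for each vocabulary word (illustrative sketch)."""
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values())
    if total == 0:
        # Document shares no words with the vocabulary.
        return {w: 0.0 for w in vocabulary}
    return {w: counts[w] / total for w in vocabulary}

doc = ["topic", "model", "topic", "tree"]
vocab = ["topic", "model", "tree", "latent"]
freqs = relative_frequencies(doc, vocab)
# "topic" occurs twice among four in-vocabulary tokens -> 0.5;
# "latent" does not occur -> 0.0.
```

Under the binary view, "topic" and "tree" would both simply be 1 for this document; the real-valued view retains the information that "topic" occurs twice as often.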