Published in: IEEE Transactions on Pattern Analysis and Machine Intelligence

Hierarchical Bayesian Modeling of Topics in Time-Stamped Documents



Abstract

We consider the problem of inferring and modeling topics in a sequence of documents with known publication dates. The documents at a given time are each characterized by a topic and the topics are drawn from a mixture model. The proposed model infers the change in the topic mixture weights as a function of time. The details of this general framework may take different forms, depending on the specifics of the model. For the examples considered here, we examine base measures based on independent multinomial-Dirichlet measures for representation of topic-dependent word counts. The form of the hierarchical model allows efficient variational Bayesian inference, of interest for large-scale problems. We demonstrate results and make comparisons to the model when the dynamic character is removed, and also compare to latent Dirichlet allocation (LDA) and Topics over Time (TOT). We consider a database of Neural Information Processing Systems papers as well as the US Presidential State of the Union addresses from 1790 to 2008.
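
The generative idea in the abstract (one topic per document, topics drawn from a mixture whose weights change over time, and multinomial word counts with Dirichlet priors) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's model: the way the weights evolve here (blending the previous weights with a fresh Dirichlet draw, controlled by a hypothetical `innovation` parameter), the fixed number of topics, and all function and parameter names are illustrative choices that the abstract does not specify.

```python
import numpy as np

def simulate_corpus(n_times=5, docs_per_time=4, n_topics=3, vocab_size=20,
                    doc_length=50, innovation=0.3, seed=0):
    """Simulate time-stamped documents with drifting topic mixture weights.

    All dynamics and hyperparameters are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    # One word distribution per topic, drawn from a symmetric Dirichlet prior.
    topic_word = rng.dirichlet(np.full(vocab_size, 0.5), size=n_topics)
    # Topic mixture weights at the first time step.
    weights = rng.dirichlet(np.ones(n_topics))
    weight_path, topics, counts = [], [], []
    for t in range(n_times):
        if t > 0:
            # Assumed dynamics: convex blend of old weights and a fresh draw.
            fresh = rng.dirichlet(np.ones(n_topics))
            weights = (1.0 - innovation) * weights + innovation * fresh
        weight_path.append(weights.copy())
        # Each document at time t is assigned exactly one topic.
        z = rng.choice(n_topics, size=docs_per_time, p=weights)
        # Word counts are multinomial given the document's topic.
        x = np.stack([rng.multinomial(doc_length, topic_word[k]) for k in z])
        topics.append(z)
        counts.append(x)
    return np.array(weight_path), np.array(topics), np.array(counts)

weight_path, topics, counts = simulate_corpus()
```

Here `weight_path[t]` holds the mixture weights at time step `t`, `topics[t]` the topic index of each document at that step, and `counts[t]` the per-document word-count vectors. An inference procedure such as the variational Bayesian approach mentioned in the abstract would work in the opposite direction, recovering the weight trajectory from the counts alone.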
