Online Sparse Collapsed Hybrid Variational-Gibbs Algorithm for Hierarchical Dirichlet Process Topic Models

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases

Abstract

Topic models for text analysis are most commonly trained using either Gibbs sampling or variational Bayes. Recently, hybrid variational-Gibbs algorithms have been found to combine the best of both worlds. Variational algorithms are fast to converge and more efficient for inference on new documents. Gibbs sampling enables sparse updates since each token is only associated with one topic instead of a distribution over all topics. Additionally, Gibbs sampling is unbiased. Although Gibbs sampling takes longer to converge, it is guaranteed to arrive at the true posterior after infinitely many iterations. By combining the two methods it is possible to reduce the bias of variational methods while simultaneously speeding up variational updates. This idea has previously been applied to standard latent Dirichlet allocation (LDA). We propose a new sampling method that enables the application of the idea to the nonparametric version of LDA, hierarchical Dirichlet process topic models. Our fast sampling method leads to a significant speedup of variational updates as compared to other sampling methods. Experiments show that training of our topic model converges to a better log-likelihood than previously existing variational methods and converges faster than Gibbs sampling in the batch setting.
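The sparse-update claim is easiest to see in a standard collapsed Gibbs sampler for LDA, the parametric setting the abstract builds on (the paper's actual contribution, a sampler for the HDP case combined with variational updates, is not reproduced here). The sketch below is purely illustrative, and all names in it (n_dk, n_kw, n_k) are assumptions, not from the paper: because each token carries exactly one topic, resampling it changes a single entry of each count table rather than a dense vector over all K topics.

```python
import numpy as np

rng = np.random.default_rng(0)
V, K, alpha, beta = 50, 5, 0.1, 0.01

# Toy corpus: 10 documents of 20 word ids each.
docs = [rng.integers(0, V, size=20).tolist() for _ in range(10)]

# Random initial topic assignments and matching count tables.
z = [[int(rng.integers(0, K)) for _ in doc] for doc in docs]
n_dk = np.zeros((len(docs), K))   # document-topic counts
n_kw = np.zeros((K, V))           # topic-word counts
n_k = np.zeros(K)                 # per-topic totals
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        n_dk[d, k] += 1
        n_kw[k, w] += 1
        n_k[k] += 1

def collapsed_gibbs_sweep():
    """Resample every token's single topic assignment once."""
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k_old = z[d][i]
            # Remove the token from the counts (topic mixtures are
            # integrated out, hence "collapsed").
            n_dk[d, k_old] -= 1
            n_kw[k_old, w] -= 1
            n_k[k_old] -= 1
            # Full conditional p(z = k | rest), up to normalization.
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            p /= p.sum()
            k_new = int(rng.choice(K, p=p))
            # Restore counts: only topic k_new's entries change, which
            # is the sparsity the abstract refers to.
            z[d][i] = k_new
            n_dk[d, k_new] += 1
            n_kw[k_new, w] += 1
            n_k[k_new] += 1

for _ in range(50):
    collapsed_gibbs_sweep()
```

A variational update, by contrast, would maintain a dense distribution over all K topics for each token, which is what the hybrid approach avoids on the sampling side.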
