Journal: Machine Learning

Additive regularization of topic models



Abstract

Probabilistic topic modeling of text collections has recently been developed mainly within the framework of graphical models and Bayesian inference. In this paper we introduce an alternative semi-probabilistic approach, which we call additive regularization of topic models (ARTM). Instead of building a purely probabilistic generative model of text, we regularize an ill-posed problem of stochastic matrix factorization by maximizing a weighted sum of the log-likelihood and additional criteria. This approach enables us to combine probabilistic assumptions with linguistic and problem-specific requirements in a single multi-objective topic model. In the theoretical part of the work we derive the regularized EM-algorithm and provide a pool of regularizers, which can be applied together in any combination. We show that many models previously developed within the Bayesian framework can be inferred more easily within ARTM and in some cases generalized. In the experimental part we show that a combination of sparsing, smoothing, and decorrelation improves several quality measures at once with almost no loss of likelihood.
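The computational core the abstract refers to is the regularized EM-algorithm, whose M-step renormalizes the expected counts shifted by the gradient of the regularization term, phi_wt ∝ (n_wt + phi_wt ∂R/∂phi_wt)_+, and likewise for theta_td. Below is a minimal NumPy sketch of that iteration under these update formulas; the function name artm_em, the regularizer-gradient interface, and all numeric settings are illustrative assumptions, not the authors' reference code.

import numpy as np

def artm_em(n_dw, T, regularizer_grads, n_iter=50, seed=0):
    # n_dw: (D, W) document-word count matrix; T: number of topics.
    # regularizer_grads(phi, theta) -> (dR/dphi, dR/dtheta): gradients of
    # the weighted sum of regularizers R (illustrative interface).
    rng = np.random.default_rng(seed)
    D, W = n_dw.shape
    phi = rng.dirichlet(np.ones(W), size=T).T    # (W, T), columns = p(w|t)
    theta = rng.dirichlet(np.ones(T), size=D).T  # (T, D), columns = p(t|d)
    for _ in range(n_iter):
        # E-step: expected counts n_wt, n_td under p(t|d,w) ∝ phi_wt * theta_td.
        p_dw = np.maximum((phi @ theta).T, 1e-12)  # (D, W) model word probabilities
        ratio = n_dw / p_dw
        n_wt = phi * (ratio.T @ theta.T)           # (W, T)
        n_td = theta * (phi.T @ ratio.T)           # (T, D)
        # M-step with additive regularization: shift the counts by the
        # gradient term, truncate at zero, and renormalize each column.
        dR_dphi, dR_dtheta = regularizer_grads(phi, theta)
        phi = np.maximum(n_wt + phi * dR_dphi, 0.0)
        phi /= np.maximum(phi.sum(axis=0, keepdims=True), 1e-12)
        theta = np.maximum(n_td + theta * dR_dtheta, 0.0)
        theta /= np.maximum(theta.sum(axis=0, keepdims=True), 1e-12)
    return phi, theta

# Example: sparsing regularizer R = beta * sum ln(phi_wt) + alpha * sum ln(theta_td)
# with beta, alpha < 0, so that phi_wt * dR/dphi_wt = beta, a constant subtracted
# from n_wt (smoothing is the same form with positive coefficients).
# The data here is synthetic, purely for illustration.
beta, alpha = -0.1, -0.05
grads = lambda phi, theta: (beta / np.maximum(phi, 1e-12),
                            alpha / np.maximum(theta, 1e-12))
n_dw = np.random.default_rng(1).integers(0, 5, size=(200, 1000)).astype(float)
phi, theta = artm_em(n_dw, T=20, regularizer_grads=grads)

Because the regularizers enter only through the gradient of their weighted sum, combining several of them amounts to summing their gradients, which is how sparsing, smoothing, and decorrelation can be applied together as the abstract describes.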
