Journal: Machine Learning

Additive regularization of topic models



Abstract

Probabilistic topic modeling of text collections has recently been developed mainly within the framework of graphical models and Bayesian inference. In this paper we introduce an alternative semi-probabilistic approach, which we call additive regularization of topic models (ARTM). Instead of building a purely probabilistic generative model of text, we regularize an ill-posed problem of stochastic matrix factorization by maximizing a weighted sum of the log-likelihood and additional criteria. This approach enables us to combine probabilistic assumptions with linguistic and problem-specific requirements in a single multi-objective topic model. In the theoretical part of the work we derive the regularized EM-algorithm and provide a pool of regularizers, which can be applied together in any combination. We show that many models previously developed within the Bayesian framework can be inferred more easily within ARTM and in some cases generalized. In the experimental part we show that a combination of sparsing, smoothing, and decorrelation improves several quality measures at once with almost no loss of likelihood.
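The computational core the abstract refers to is the regularized EM-algorithm, whose M-step renormalizes the expected counts shifted by the gradient of the regularization term, phi_wt ∝ (n_wt + phi_wt ∂R/∂phi_wt)_+, and likewise for theta_td. Below is a minimal NumPy sketch of that iteration under these update formulas; the function name artm_em, the regularizer-gradient interface, and all numeric settings are illustrative assumptions, not the authors' reference code.

import numpy as np

def artm_em(n_dw, T, regularizer_grads, n_iter=50, seed=0):
    # n_dw: (D, W) document-word count matrix; T: number of topics.
    # regularizer_grads(phi, theta) -> (dR/dphi, dR/dtheta): gradients of
    # the weighted sum of regularizers R (illustrative interface).
    rng = np.random.default_rng(seed)
    D, W = n_dw.shape
    phi = rng.dirichlet(np.ones(W), size=T).T    # (W, T), columns = p(w|t)
    theta = rng.dirichlet(np.ones(T), size=D).T  # (T, D), columns = p(t|d)
    for _ in range(n_iter):
        # E-step: expected counts n_wt, n_td under p(t|d,w) ∝ phi_wt * theta_td.
        p_dw = np.maximum((phi @ theta).T, 1e-12)  # (D, W) model word probabilities
        ratio = n_dw / p_dw
        n_wt = phi * (ratio.T @ theta.T)           # (W, T)
        n_td = theta * (phi.T @ ratio.T)           # (T, D)
        # M-step with additive regularization: shift the counts by the
        # gradient term, truncate at zero, and renormalize each column.
        dR_dphi, dR_dtheta = regularizer_grads(phi, theta)
        phi = np.maximum(n_wt + phi * dR_dphi, 0.0)
        phi /= np.maximum(phi.sum(axis=0, keepdims=True), 1e-12)
        theta = np.maximum(n_td + theta * dR_dtheta, 0.0)
        theta /= np.maximum(theta.sum(axis=0, keepdims=True), 1e-12)
    return phi, theta

# Example: sparsing regularizer R = beta * sum ln(phi_wt) + alpha * sum ln(theta_td)
# with beta, alpha < 0, so that phi_wt * dR/dphi_wt = beta, a constant subtracted
# from n_wt (smoothing is the same form with positive coefficients).
# The data here is synthetic, purely for illustration.
beta, alpha = -0.1, -0.05
grads = lambda phi, theta: (beta / np.maximum(phi, 1e-12),
                            alpha / np.maximum(theta, 1e-12))
n_dw = np.random.default_rng(1).integers(0, 5, size=(200, 1000)).astype(float)
phi, theta = artm_em(n_dw, T=20, regularizer_grads=grads)

Because the regularizers enter only through the gradient of their weighted sum, combining several of them amounts to summing their gradients, which is how sparsing, smoothing, and decorrelation can be applied together as the abstract describes.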
