Online Multiscale Dynamic Topic Models

Abstract

We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve on multiple timescales. For example, some words may be used consistently for over one hundred years, while other words emerge and disappear within a few days. Thus, in the proposed model, the current topic-specific distributions over words are assumed to be generated from the multiscale word distributions of the previous epoch. Considering both long-timescale and short-timescale dependencies yields a more robust model. We derive efficient online inference procedures based on a stochastic EM algorithm, in which the model is sequentially updated using newly obtained data; this means that past data are not required for inference. We demonstrate the effectiveness of the proposed method in terms of predictive performance and computational efficiency by examining collections of real documents with timestamps.
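The idea of generating the current word distribution of a topic from multiscale word distributions of earlier epochs can be illustrated with a small sketch. This is not the authors' formulation: the window lengths (1, 2, 4, ... epochs), the fixed scale weights, the concentration `alpha`, the smoothing constant, and the function name `multiscale_prior` are all assumptions made for illustration, and the abstract specifies none of them.

```python
import numpy as np

def multiscale_prior(history, weights, alpha=50.0, smooth=1e-3):
    """Build a Dirichlet prior over words for one topic.

    history: per-epoch word-count vectors for the topic, oldest first.
    weights: one non-negative weight per timescale (hypothetical, fixed here).
    Scale s averages over the last 2**s epochs (an assumed windowing scheme).
    """
    history = np.asarray(history, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalise the scale weights
    mean = np.zeros(history.shape[1])
    for s, w in enumerate(weights):
        window = history[-(2 ** s):]           # last 1, 2, 4, ... epochs
        counts = window.sum(axis=0) + smooth   # smoothed counts at this scale
        mean += w * counts / counts.sum()      # word distribution at this scale
    return alpha * mean                        # Dirichlet parameters

# Word 0 was frequent for a long time; word 1 appeared only in the last epoch.
history = [[10, 0, 0], [10, 0, 0], [10, 0, 0], [0, 10, 0]]
prior = multiscale_prior(history, weights=[0.5, 0.3, 0.2])

# Draw the current epoch's word distribution from the multiscale prior.
rng = np.random.default_rng(0)
phi = rng.dirichlet(prior)
```

In this toy setup the short-scale term favors the newly emerged word, while the long-scale term keeps probability mass on the word that was used consistently. Only summary counts per scale are needed to build the prior, which is in the spirit of the online setting described in the abstract.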
