Conference: European Conference on Machine Learning and Knowledge Discovery in Databases

Scalable Moment-Based Inference for Latent Dirichlet Allocation

Abstract

Topic models such as Latent Dirichlet Allocation are widely used text analysis methods. Recently, moment-based inference with provable performance has been proposed for topic models. Compared with inference algorithms that approximate the maximum likelihood objective, moment-based inference has theoretical guarantees for recovering model parameters. One such inference method is tensor orthogonal decomposition, which requires only mild assumptions for exact recovery of topics. However, it suffers from scalability issues due to the creation of dense, high-dimensional tensors. In this work, we propose a speedup technique that leverages the special structure of the tensors. It is efficient in both time and space, and requires only two scans of the corpus. It improves over the state-of-the-art inference algorithm by one to three orders of magnitude while preserving equal inference ability.
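To make the tensor orthogonal decomposition mentioned in the abstract more concrete, the following is a minimal sketch of generic tensor power iteration with deflation on a small synthetic tensor. It is not the speedup technique proposed in the paper: it deliberately builds the dense third-order tensor, which is the cost the paper aims to avoid. The function names `make_odeco_tensor` and `tensor_power_iteration`, the parameters, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def make_odeco_tensor(weights, vectors):
    """Build the dense tensor T = sum_k w_k * v_k (x) v_k (x) v_k.

    `vectors` holds one (assumed orthonormal) component per column.
    """
    return np.einsum("k,ik,jk,lk->ijl", weights, vectors, vectors, vectors)


def tensor_power_iteration(T, n_components, n_restarts=10, n_iters=100, seed=0):
    """Recover (weight, vector) pairs of a symmetric, orthogonally
    decomposable third-order tensor by power iteration with deflation."""
    rng = np.random.default_rng(seed)
    T = T.copy()  # the input tensor is left unmodified
    d = T.shape[0]
    weights, vectors = [], []
    for _ in range(n_components):
        best_val, best_vec = -np.inf, None
        for _ in range(n_restarts):
            # Random starting point on the unit sphere.
            v = rng.standard_normal(d)
            v /= np.linalg.norm(v)
            for _ in range(n_iters):
                # Power update: v <- T(I, v, v) / ||T(I, v, v)||
                u = np.einsum("ijl,j,l->i", T, v, v)
                norm = np.linalg.norm(u)
                if norm < 1e-12:
                    break
                v = u / norm
            # The value T(v, v, v) estimates the weight of the component.
            val = float(np.einsum("ijl,i,j,l->", T, v, v, v))
            if val > best_val:
                best_val, best_vec = val, v
        weights.append(best_val)
        vectors.append(best_vec)
        # Deflate: subtract the recovered rank-one term before the next pass.
        T -= best_val * np.einsum("i,j,l->ijl", best_vec, best_vec, best_vec)
    return np.array(weights), np.column_stack(vectors)


# Synthetic example: 3 orthonormal components in 5 dimensions.
rng = np.random.default_rng(42)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
true_vectors = Q[:, :3]
true_weights = np.array([3.0, 2.0, 1.0])

T = make_odeco_tensor(true_weights, true_vectors)
est_weights, est_vectors = tensor_power_iteration(T, n_components=3)
```

The dense tensor here has d^3 entries, so the same construction over a realistic vocabulary would be very large; that is the scalability issue the abstract refers to.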
