Conference: ACM SIGKDD international conference on knowledge discovery and data mining; KDD 10

Evolutionary Hierarchical Dirichlet Processes for Multiple Correlated Time-varying Corpora


Abstract

Mining cluster evolution from multiple correlated time-varying text corpora is important in exploratory text analytics. In this paper, we propose an approach called evolutionary hierarchical Dirichlet processes (EvoHDP) to discover interesting cluster evolution patterns from such text data. We formulate the EvoHDP as a series of hierarchical Dirichlet processes (HDP) by adding time dependencies to the adjacent epochs, and propose a cascaded Gibbs sampling scheme to infer the model. This approach can discover different evolving patterns of clusters, including emergence, disappearance, evolution within a corpus and across different corpora. Experiments over synthetic and real-world multiple correlated time-varying data sets illustrate the effectiveness of EvoHDP on discovering cluster evolution patterns.
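The abstract describes EvoHDP as a series of HDPs with time dependencies between adjacent epochs, but gives no equations. The sketch below is only a rough illustration of that idea, not the paper's formulation. It draws truncated stick-breaking weights for a Dirichlet process at each epoch and couples adjacent epochs with a convex blend. The blend rule, the shared truncated cluster index, and the names `stick_breaking`, `blend`, `simulate_epochs`, and the mixing weight `w` are all assumptions made for illustration. The cascaded Gibbs sampler mentioned in the abstract is not shown.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # last cluster absorbs the leftover mass
    return weights


def blend(previous, fresh, w):
    """Hypothetical time dependency (an assumption, not taken from the paper):
    mix the previous epoch's cluster weights with a fresh draw."""
    return [w * p + (1.0 - w) * f for p, f in zip(previous, fresh)]


def simulate_epochs(num_epochs, alpha=1.0, truncation=8, w=0.7, seed=0):
    """Return one weight vector per epoch over a shared, truncated cluster set."""
    rng = random.Random(seed)
    epochs = [stick_breaking(alpha, truncation, rng)]
    for _ in range(num_epochs - 1):
        fresh = stick_breaking(alpha, truncation, rng)
        epochs.append(blend(epochs[-1], fresh, w))
    return epochs


if __name__ == "__main__":
    for t, weights in enumerate(simulate_epochs(3)):
        print(t, [round(x, 3) for x in weights])
```

In this toy setting, a cluster whose weight grows from near zero over successive epochs loosely corresponds to emergence, and one whose weight decays toward zero to disappearance. A larger `w` makes adjacent epochs more similar.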
