Venue: ACM KDD International Conference on Knowledge Discovery and Data Mining (KDD 2008)

Model-Based Document Clustering with a Collapsed Gibbs Sampler

Abstract

Model-based algorithms are emerging as a preferred method for document clustering. As computing resources improve, methods such as Gibbs sampling have become more common for parameter estimation in these models. Gibbs sampling is well understood for many applications, but has not been extensively studied for use in document clustering. We explore the convergence rate, the possibility of label switching, and chain summarization methodologies for document clustering on a particular model, namely a mixture of multinomials model, and show that fairly simple methods can be employed, while still producing clusterings of superior quality compared to those produced with the EM algorithm.
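
To illustrate the kind of sampler the abstract refers to, the following is a minimal sketch of collapsed Gibbs sampling for a mixture of multinomials, assuming symmetric Dirichlet priors on the cluster weights and on each cluster's word distribution. The function name, the hyperparameters `alpha` and `beta`, the sparse dict representation of documents, and the choice to return the final sample of the chain are all assumptions made for this example; the paper's own priors, convergence checks, and chain summarization method may differ.

```python
import math
import random


def collapsed_gibbs_mixture(docs, n_clusters, vocab_size,
                            alpha=1.0, beta=0.1, n_iters=50, seed=0):
    """Cluster documents with a collapsed Gibbs sampler (illustrative sketch).

    docs: list of dicts mapping word id -> count (hypothetical input format).
    alpha, beta: assumed symmetric Dirichlet hyperparameters.
    Returns the cluster labels from the last sweep of the chain.
    """
    rng = random.Random(seed)
    z = [rng.randrange(n_clusters) for _ in docs]   # random initial labels
    doc_len = [sum(d.values()) for d in docs]

    m = [0] * n_clusters                                   # documents per cluster
    n_kw = [[0] * vocab_size for _ in range(n_clusters)]   # word counts per cluster
    n_k = [0] * n_clusters                                 # total tokens per cluster

    def move(d, k, sign):
        # Add (sign=+1) or remove (sign=-1) document d from cluster k's counts.
        m[k] += sign
        n_k[k] += sign * doc_len[d]
        for w, c in docs[d].items():
            n_kw[k][w] += sign * c

    for d, k in enumerate(z):
        move(d, k, +1)

    for _ in range(n_iters):
        for d in range(len(docs)):
            move(d, z[d], -1)   # condition on all other documents
            log_p = []
            for k in range(n_clusters):
                # Prior term: cluster size plus pseudo-count.
                lp = math.log(m[k] + alpha)
                # Dirichlet-multinomial predictive term with the cluster's
                # word distribution integrated out.
                lp += (math.lgamma(n_k[k] + vocab_size * beta)
                       - math.lgamma(n_k[k] + doc_len[d] + vocab_size * beta))
                for w, c in docs[d].items():
                    lp += (math.lgamma(n_kw[k][w] + c + beta)
                           - math.lgamma(n_kw[k][w] + beta))
                log_p.append(lp)
            # Normalize in log space before sampling to avoid underflow.
            mx = max(log_p)
            weights = [math.exp(lp - mx) for lp in log_p]
            z[d] = rng.choices(range(n_clusters), weights=weights)[0]
            move(d, z[d], +1)
    return z


# Toy corpus: two documents over words {0, 1}, two over words {2, 3}.
toy_docs = [{0: 5, 1: 4}, {0: 6, 1: 3}, {2: 5, 3: 4}, {2: 4, 3: 6}]
labels = collapsed_gibbs_mixture(toy_docs, n_clusters=2, vocab_size=4)
print(labels)
```

Because cluster labels are exchangeable under this model, the numeric label given to a group can differ between runs or swap within a run, which is the label-switching issue the abstract mentions. Returning only the last sample is the simplest possible chain summary and is used here purely for brevity.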
