2017 IEEE International Conference on Big Knowledge

Multi-label Classification Using Stacked Hierarchical Dirichlet Processes with Reduced Sampling Complexity


Abstract

Nonparametric topic models based on hierarchical Dirichlet processes (HDPs) allow the number of topics to be discovered automatically from the data. The computational complexity of standard Gibbs sampling techniques for model training is linear in the total number of topics. Recently, it was reduced to linear in the number of topics per word using a technique called alias sampling combined with Metropolis-Hastings (MH) sampling. We propose a different proposal distribution for the MH step, based on the observation that distributions at the upper level of the hierarchy change more slowly than the document-specific distributions at the lower level. This reduces the sampling complexity to linear in the number of topics per document, while leaving the test set log-likelihood unchanged. Furthermore, we propose a novel model of stacked HDPs utilizing this sampling method. Experiments demonstrate the effectiveness of the proposed approach in the context of multi-label classification.
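To make the complexity argument concrete, the following is a minimal Python sketch of the general alias-table plus Metropolis-Hastings pattern the abstract refers to: proposals are drawn in O(1) from a cached, slowly changing upper-level topic distribution, and the MH correction only touches the sparse document-specific counts. The conditional used here is a simplified toy target, and all variable names (`corpus_prior`, `doc_topic_counts`, `word_topic_weight`, `alpha`) are illustrative assumptions, not the paper's actual sampler.

```python
import numpy as np

def build_alias_table(probs):
    """Walker's alias method: O(K) construction, O(1) draws from a fixed
    categorical distribution (here a cached corpus-level topic distribution)."""
    probs = np.asarray(probs, dtype=float)
    K = len(probs)
    scaled = probs * K / probs.sum()
    prob = np.zeros(K)
    alias = np.zeros(K, dtype=int)
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]
        alias[s] = l
        scaled[l] -= 1.0 - scaled[s]
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:          # numerical leftovers get probability 1
        prob[i] = 1.0
    return prob, alias

def alias_draw(prob, alias, rng):
    """O(1) sample from the alias table."""
    k = rng.integers(len(prob))
    return k if rng.random() < prob[k] else int(alias[k])

def mh_resample_topic(z, doc_topic_counts, word_topic_weight,
                      corpus_prior, prob, alias, alpha, rng, n_steps=2):
    """Resample one word's topic assignment with Metropolis-Hastings.

    Proposal q(k) ~ corpus_prior[k] is drawn in O(1) from the alias table;
    because the corpus-level distribution changes slowly, the table only
    needs occasional rebuilding.  The acceptance ratio corrects toward a
    toy target p(k) ~ (n_dk + alpha * corpus_prior[k]) * word_topic_weight[k],
    whose document-dependent part touches only topics present in the document.
    """
    def target(k):
        return (doc_topic_counts.get(k, 0.0)
                + alpha * corpus_prior[k]) * word_topic_weight[k]

    for _ in range(n_steps):
        k_new = alias_draw(prob, alias, rng)
        # MH ratio: p(k_new) q(z) / (p(z) q(k_new))
        accept = (target(k_new) * corpus_prior[z]) / \
                 (target(z) * corpus_prior[k_new] + 1e-300)
        if rng.random() < accept:
            z = k_new
    return z

# Toy usage with made-up numbers (purely illustrative):
rng = np.random.default_rng(0)
corpus_prior = np.array([0.5, 0.3, 0.2])               # upper-level topic weights
prob, alias = build_alias_table(corpus_prior)
z = mh_resample_topic(z=0,
                      doc_topic_counts={0: 3.0, 2: 1.0},  # sparse per-document counts
                      word_topic_weight=np.array([0.01, 0.05, 0.02]),
                      corpus_prior=corpus_prior,
                      prob=prob, alias=alias, alpha=1.0, rng=rng)
```

Under these assumptions, each proposal costs O(1) once the alias table is built, and the per-word correction depends only on the topics actually present in the document, which is the kind of per-document sparsity the abstract's complexity claim relies on.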
