
A Supervised Parameter Estimation Method of LDA



Abstract

The Latent Dirichlet Allocation (LDA) probabilistic topic model is an effective dimension-reduction tool that automatically extracts latent topics and represents text in a lower-dimensional semantic topic space. However, the original LDA and most of its variants are unsupervised and make no use of the category labels of the documents in the training corpus. Most of them also treat all terms in the vocabulary as equally important, although term weights differ, especially in a skewed corpus in which some categories have many more samples than others. We therefore propose a supervised parameter estimation method, based on category and document information, that estimates the parameters of LDA according to term weight. Comparative experiments show that the proposed method performs better on skewed text classification and largely improves the recall and precision of the minority category.
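The abstract does not give the estimation formulas, so the following is only a minimal sketch of one generic way term weights can enter LDA parameter estimation: a collapsed Gibbs sampler (Gibbs sampling is listed among the keywords) in which each token adds its term weight, rather than 1, to the count tables. This is not the authors' method. The function name `weighted_gibbs_lda`, the `weights` argument, and the hyperparameter defaults are assumptions made for illustration, and the sketch does not show how the weights would be derived from category and document information.

```python
import random


def weighted_gibbs_lda(docs, weights, n_topics, vocab_size,
                       alpha=0.1, beta=0.01, n_iters=50, seed=0):
    """Collapsed Gibbs sampling for LDA with weighted counts (illustrative).

    docs    : list of documents, each a list of integer term ids
    weights : weights[w] is a hypothetical non-negative weight for term w;
              setting every weight to 1.0 recovers the standard sampler
    Returns (theta, phi): document-topic and topic-term distributions.
    """
    rng = random.Random(seed)
    n_dk = [[0.0] * n_topics for _ in docs]               # doc-topic counts
    n_kw = [[0.0] * vocab_size for _ in range(n_topics)]  # topic-term counts
    n_k = [0.0] * n_topics                                # topic totals

    # Random initial topic assignment for every token.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            n_dk[d][k] += weights[w]
            n_kw[k][w] += weights[w]
            n_k[k] += weights[w]
        z.append(z_d)

    topics = range(n_topics)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wt = weights[w]
                k_old = z[d][i]
                # Remove the token's weighted contribution.
                n_dk[d][k_old] -= wt
                n_kw[k_old][w] -= wt
                n_k[k_old] -= wt
                # Conditional topic probabilities (unnormalised).
                probs = [
                    (n_dk[d][k] + alpha)
                    * (n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta)
                    for k in topics
                ]
                k_new = rng.choices(topics, weights=probs)[0]
                # Add the token back under its newly sampled topic.
                z[d][i] = k_new
                n_dk[d][k_new] += wt
                n_kw[k_new][w] += wt
                n_k[k_new] += wt

    # Point estimates of the LDA parameters from the final counts.
    theta = [
        [(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
        for row in n_dk
    ]
    phi = [
        [(c + beta) / (sum(row) + vocab_size * beta) for c in row]
        for row in n_kw
    ]
    return theta, phi
```

In this sketch a term with a larger weight contributes more mass to the topic it is assigned to, which is one way terms characteristic of a minority category could be kept from being swamped by the majority categories in a skewed corpus.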

Bibliographic record

  • Source
    Web technologies and applications | 2015 | pp. 401-410 | 10 pages
  • Conference location: Guangzhou (CN)
  • Author affiliations

    Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China; Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China; School of Software, Beijing Institute of Technology, Beijing, 100081, China;

    Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China;

    Institute of Information Engineering, Chinese Academy of Sciences, Beijing, 100093, China;

    School of Software, Beijing Institute of Technology, Beijing, 100081, China;

  • Conference organizer
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    LDA; parameter estimation; Gibbs sampling; skewed text classification; term weighting;

  • Date added to database: 2022-08-26 14:26:58
