IEEE Workshop on Automatic Speech Recognition and Understanding

Refine bigram PLSA model by assigning latent topics unevenly



Abstract

As an important component of many speech and language processing applications, statistical language models have been widely investigated. The bigram topic model, which combines the advantages of the traditional n-gram model and the topic model, has proved to be a promising language modeling approach. However, the original bigram topic model assigns the same number of latent topics to every context word, ignoring the fact that the latent semantics of different context words vary in complexity. We present a new bigram topic model, the bigram PLSA model, and propose a modified training strategy that assigns latent topics to context words unevenly, according to an estimate of their latent semantic complexity. As a consequence, a refined bigram PLSA model is obtained. Experiments on HUB4 Mandarin test transcriptions show the superiority of the proposed model over existing models, and further perplexity improvements are achieved through the use of the refined bigram PLSA model.
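To make the idea concrete, the following Python sketch (not the authors' code; the vocabulary size, topic counts, and random parameters are illustrative assumptions) shows a bigram PLSA mixture in which each context word v carries its own number of latent topics K[v], together with the perplexity metric the abstract reports. For brevity the document index of full PLSA is folded into a per-context topic mixture.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 1000  # vocabulary size (toy value, assumption)

# Uneven topic assignment: each context word v gets its own topic count K[v].
# Here the counts are random placeholders; the paper derives them from an
# estimate of each context word's latent semantic complexity.
K = rng.integers(2, 16, size=V)

# Per-context parameters of the mixture (randomly initialized for the sketch):
#   theta[v][z]  ~ P(z | v)     topic mixture for context word v (K[v] topics)
#   phi[v][z, w] ~ P(w | v, z)  next-word distribution for each topic
theta = [rng.dirichlet(np.ones(k)) for k in K]
phi = [rng.dirichlet(np.ones(V), size=k) for k in K]

def bigram_plsa_prob(w: int, v: int) -> float:
    """P(w | v) = sum_z P(w | v, z) * P(z | v), marginalizing over K[v] topics."""
    return float(theta[v] @ phi[v][:, w])

def perplexity(bigrams) -> float:
    """Perplexity over (context, word) pairs, the evaluation metric used above."""
    logp = [np.log(bigram_plsa_prob(w, v)) for v, w in bigrams]
    return float(np.exp(-np.mean(logp)))
```

In a real system theta and phi would be fitted with EM on training text rather than sampled; the point of the sketch is only that the summation bound K[v] differs per context word instead of being one global constant.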
