EURASIP Journal on Advances in Signal Processing

A new bigram-PLSA language model for speech recognition

Abstract

A novel method for combining a bigram model and Probabilistic Latent Semantic Analysis (PLSA) is introduced for language modeling. The motivation behind this idea is the relaxation of the bag-of-words assumption that is fundamental to latent topic models, including the PLSA model. An EM-based parameter estimation technique for the proposed model is presented in this paper. Previous attempts to incorporate word order into the PLSA model are surveyed and compared with the proposed model, both in theory and through experimental evaluation. The perplexity measure is employed to compare the effectiveness of recently introduced models with that of the proposed model. Furthermore, experiments are designed and carried out on continuous speech recognition (CSR) tasks using word error rate (WER) as the evaluation criterion. The results demonstrate the superiority of the new bigram-PLSA model over Nie et al.'s bigram-PLSA model and the simple PLSA model. Experiments on the BLLIP WSJ corpus show about a 12% reduction in perplexity and a 2.8% WER improvement compared to Nie et al.'s bigram-PLSA model.
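The abstract does not spell out the model's exact parameterization. As a rough illustration only, the sketch below assumes the commonly cited bigram-PLSA decomposition P(w_i | w_{i-1}, d) = sum_z P(w_i | w_{i-1}, z) * P(z | d) and shows how sentence perplexity, the first evaluation criterion mentioned above, would be computed under such a mixture. The NumPy arrays, the helper functions bigram_plsa_prob and perplexity, and the toy vocabulary and topic sizes are illustrative assumptions, not the paper's implementation.

    # Minimal sketch of a bigram-PLSA mixture and perplexity evaluation (assumed form).
    import numpy as np

    rng = np.random.default_rng(0)
    V, Z = 5, 3  # toy vocabulary size and number of latent topics

    # P(w_i | w_{i-1}, z): one row-stochastic V x V matrix per topic
    p_w_given_prev_z = rng.random((Z, V, V))
    p_w_given_prev_z /= p_w_given_prev_z.sum(axis=2, keepdims=True)

    # P(z | d): topic mixture weights of the current document
    p_z_given_d = rng.random(Z)
    p_z_given_d /= p_z_given_d.sum()

    def bigram_plsa_prob(w, w_prev):
        """Mixture probability P(w | w_prev, d) = sum_z P(w | w_prev, z) P(z | d)."""
        return float(np.dot(p_z_given_d, p_w_given_prev_z[:, w_prev, w]))

    def perplexity(word_ids):
        """Perplexity of a word-id sequence under the bigram-PLSA mixture."""
        log_prob = sum(np.log(bigram_plsa_prob(w, w_prev))
                       for w_prev, w in zip(word_ids[:-1], word_ids[1:]))
        n = len(word_ids) - 1  # number of predicted words
        return float(np.exp(-log_prob / n))

    print(perplexity([0, 2, 1, 4, 3]))

The EM-based parameter estimation described in the abstract would fit p_w_given_prev_z and p_z_given_d from a corpus; that training procedure is not reproduced here.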